3004-9US 



TITLE OF INVENTION: POLYENE POLYKETIDES, PROCESSES FOR 
THEIR PRODUCTION AND THEIR USE AS A PHARMACEUTICAL 

RELATED APPLICATIONS: 

[01] This application claims priority to U.S. Provisional Application 
60/441,123 filed January 21, 2003; U.S. Provisional Application 60/494,568 
filed August 13, 2003; U.S. Provisional Application 60/469,810 filed May 13, 
2003; and U.S. Provisional 60/491,516 filed August 1, 2003. 

FIELD OF INVENTION: 

[02] This invention relates to a new class of polyene polyketides, their 
pharmaceutically acceptable salts and derivatives, and to methods for their 
production. One method of obtaining these novel polyketides is by cultivation 
of novel strains of Streptomyces aizunensis; another method involves 
expression of the biosynthetic gene cluster of the invention in transformed 
host cells. The compounds may also be produced by known strains of certain 
bacteria. The invention also encompasses the novel strains of Streptomyces 
aizunensis which produce these compounds, as well as the gene cluster 
which directs the biosynthesis of these compounds. The invention also 
includes the use of these novel polyketides and their pharmaceutically 
acceptable salts and derivatives as pharmaceuticals, in particular, to their use 
as inhibitors of fungal and bacterial cell growth, inhibitors of cancer cell growth 
and for lowering serum cholesterol and other steroids. The invention also 
encompasses pharmaceutical compositions comprising these novel 
polyketides, or pharmaceutically acceptable salts or derivatives thereof. 

BACKGROUND: 

[03] Actinomycetes comprise a family of bacteria that are abundant in soil 
and have generated significant commercial and scientific interest as a result of 
the large number of therapeutically useful antibiotics, antifungals, anticancer 
and cholesterol-lowering agents, produced as secondary metabolites by these 
bacteria. Many actinomycetes, particularly those of the Streptomyces genus, 
have been extensively studied because of their ability to produce a notable 
diversity of biologically active metabolites. The intensive search for new 
natural products has led to the identification of new species of bacteria and 
the creation of improved strains. 

[04] Polyene polyketides are a group of natural products produced by 
actinomycetes that have generated significant commercial interest. For 
example, Sakuda et al., 1996, J. of Chem. Soc., Perkin Trans. 1, 2315-19; and 
Sakuda et al., Tetrahedron Letters, Vol. 35, No. 16, 2777-2789 (1995) disclose 
the linear polyene linearmycin A produced by a Streptomyces sp. Sakuda et 
al. report that linearmycin A has shown both antifungal and antibacterial 
activity. Pawlak et al. J of Antibiotics, Vol. XXXIII No. 9, 989-997 disclose the 
polyene macrolide lienomycin produced by Actinomyces 
diastatochromogenes. Pawlak et al. report that lienomycin has shown 
antifungal, antibacterial and anti-tumor activity. Antifungal activity of polyene 
macrolides has also been correlated with hypocholesterolemic effect (C.P. 
Schaffner, Polyene Macrolides in Clinical Practice, in Macrolide Antibiotics: 
Chemistry, Biology and Practice, S. Omura, ed., Academic Press (1984), p. 
491; C.P. Schaffner and H.W. Gordon, Proc. Natl. Acad. Sci. U.S.A. 61, 36 
(1968)). 

[05] Polyketides have carbon chain backbones formed of two-carbon units 
through a series of condensation reactions and subsequent modifications. 
Type I polyketides are synthesized in nature by modular polyketide synthase 
(PKS) enzymes having a set of separate catalytic active sites for each cycle of 
carbon chain elongation and modification. Because of the multimodular 
nature of PKS proteins, much is known of the specificity and mechanism of 
the biosynthesis of polyketides. 

[06] Although many biologically active compounds have been identified, 
there remains the need to obtain novel naturally occurring compounds with 
enhanced properties. Current methods of obtaining such compounds include 
screening of natural isolates and chemical modification of existing 
compounds, both of which are costly and time consuming. Current screening 
methods are based on general biological properties of the compound, which 
require prior knowledge of the structure of the molecules. Methods for 
chemically modifying known active compounds exist, but still suffer from 
practical limitations as to the type of compounds obtainable. 
[07] Thus, there exists a considerable need to obtain pharmaceutically 
active compounds in a cost-effective manner and with high yield. The present 
invention solves these problems by providing improved strains of 
Streptomyces aizunensis capable of producing potent new therapeutic 
compounds, as well as reagents (e.g. polynucleotides, vectors comprising the 
polynucleotides and host cells comprising the vectors) and methods to 
generate novel compounds by de novo biosynthesis rather than by chemical 
synthesis. 

SUMMARY OF THE INVENTION: 

[08] The present invention encompasses compounds of Formula I: 




[Chemical structure of Formula I not reproduced; surviving labels: CH3, Y5, Y7, Y11, Y15] 

Formula I 



and pharmaceutically acceptable salts thereof; 
wherein, 

A is selected from the group consisting of -NR1R2, -N=CR1R2, and two 
further groups drawn as structures in the original [structures not 
reproduced; surviving labels: -NR1, NR2, NHR3 and -NH-, O, R4]; 
R1, R2, R3 and R4 are each independently selected from the group 
consisting of H, C1-6 alkyl, C2-6 alkenyl, C3-6 cycloalkyl, C2-6 heterocycloalkyl, 
aryl, heteroaryl and amino acid, wherein said alkyl, alkenyl, aryl and heteroaryl 
are optionally substituted with a group selected from halogen, OH, NO2, NH2 
or aryl, said aryl being optionally further substituted with one or more groups 
independently selected from halogen, OH, NO2 or NH2; 

B is selected from ethene-1,2-diyl or [structure bearing R10 not reproduced]; 

wherein R10 is oxo or OR11; 

wherein R11 is H or a heterocycloalkyl, the 
heterocycloalkyl being optionally substituted with 1-4 
substituents selected from OX, C1-3 alkyl and -O-C(O)R1, 
wherein X is H or, when there are at least two 
neighboring substituent groups that are OX, then the X 
can be a bond such that the two neighboring oxygen 
groups form a five-membered acetal ring of the formula: 

[structure not reproduced]; 

wherein R5 and R6 are each 
independently selected from the group consisting of H, 
C1-6 alkyl, and C2-7 alkenyl; 



D is selected from [structures not reproduced] 

wherein 
R12 is selected from H and C1-6 alkyl optionally substituted with 1 
to 2 phenyl groups, wherein the phenyl group is optionally 
substituted with C1-6 alkyl or halo; 

R12a and R12a are each independently selected from H, C1-6 alkyl, 
C2-6 alkenyl, C3-6 cycloalkyl, C2-6 heterocycloalkyl, aryl, heteroaryl 
and amino acid, wherein said alkyl, alkenyl, aryl and heteroaryl 
are optionally substituted with a group selected from halogen, 
OH, NO2, NH2 or aryl, said aryl being optionally further 
substituted with one or more groups independently selected 
from halogen, OH, NO2 or NH2; 

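W1 is [structure not reproduced]; 
W2 is [structure not reproduced]; 
W3 is [structure not reproduced]; 
W5 is [structure not reproduced]; 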

X1, X2, X3, X4, X5, X6, X7, X8, X9, X12 and X13 are each independently 
selected from H, -C(O)-R7 and a bond such that when any of two neighboring 
X1, X2, X3, X4, X5, X6, X7, X8, X9, X12 and X13 is a bond then the two 
neighboring oxygen atoms and their attached carbon atoms together form a 
six-membered acetal ring of the formula: 

[structure not reproduced: six-membered acetal ring bearing R5 and R6] 

R5, R6 and R7 are each independently selected from H, C1-6 alkyl and 
C2-7 alkenyl; 

[structure fragment not reproduced; surviving labels: OX12, OX13, CH3]; 
Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y9, Y10, Y11, Y12, Y13 and Y15 are each 
independently selected from the group consisting of ethene-1,2-diyl, 
ethane-1,2-diyl and [structure not reproduced]; wherein said ethene-1,2-diyl and 
ethane-1,2-diyl groups are optionally substituted with a methyl 
group; 



Z is selected from OH, NHR1 and [structures not reproduced]; and when the dotted line 
is a bond then Z is oxo, or NR [substituent label lost in extraction]; 

R8 is selected from H, C1-6 alkyl and C2-6 alkenyl; 
R9 is C1-6 alkyl optionally substituted with aryl. 



[09] The invention is also directed to the Compound 2(a), a linear 
glycosylated polyketide with an amidohydroxycyclopentenone component, 
and pharmaceutically acceptable salts thereof: 




Compound 2(a) 

[010] The systematic name for Compound 2(a) has been determined to be: 
56-Amino-15,17,33,35,37,41,43,45,47,51,53-undecahydroxy-14,16,30- 
trimethyl-31-oxo-29-(3,4,5-trihydroxy-6-methyl-tetrahydro-pyran-2-yloxy)- 
hexapentaconta-2,4,6,8,12,18,20,22,24,26,38,48-dodecaenoic acid (2- 
hydroxy-5-oxo-cyclopent-1-enyl)-amide. 
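
As a consistency check on the systematic name above, the number of locants preceding each multiplying prefix should equal the value of that prefix (tri = 3, undeca = 11, dodeca = 12). The following Python sketch is illustrative only; the function name check_locants and the regular expression are assumptions introduced here, and the name string is the one recited above with extraction spacing removed.

```python
import re

# Systematic name as recited in paragraph [010], with stray spaces removed.
NAME = (
    "56-Amino-15,17,33,35,37,41,43,45,47,51,53-undecahydroxy-14,16,30-"
    "trimethyl-31-oxo-29-(3,4,5-trihydroxy-6-methyl-tetrahydro-pyran-2-yloxy)-"
    "hexapentaconta-2,4,6,8,12,18,20,22,24,26,38,48-dodecaenoic acid "
    "(2-hydroxy-5-oxo-cyclopent-1-enyl)-amide"
)

# Multiplying prefixes that appear after a locant list in this name.
PREFIX_COUNT = {"tri": 3, "undeca": 11, "dodeca": 12}


def check_locants(name: str):
    """Return (prefix, locants, consistent) for each 'locants-prefix' group."""
    pattern = re.compile(r"(\d+(?:,\d+)*)-(undeca|dodeca|tri)")
    results = []
    for locants, prefix in pattern.findall(name):
        numbers = [int(n) for n in locants.split(",")]
        results.append((prefix, numbers, len(numbers) == PREFIX_COUNT[prefix]))
    return results


if __name__ == "__main__":
    for prefix, numbers, consistent in check_locants(NAME):
        print(prefix, len(numbers), "consistent" if consistent else "MISMATCH")
```

Such a check confirms only internal consistency of the locant counts; it says nothing about whether the name corresponds to the drawn structure.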

[011] The invention encompasses pharmaceutical compositions of 
compounds of Formula I comprising a therapeutically effective amount of the 
compound of Formula I or a pharmaceutically acceptable salt thereof, and a 
pharmaceutically acceptable carrier. In particular, the invention is directed to 
pharmaceutical compositions of compound 2(a) comprising a therapeutically 
effective amount of the compound 2(a) or a pharmaceutically acceptable salt 
thereof, and a pharmaceutically acceptable carrier. 
[012] The present invention is also directed to methods for producing the 
compound 2(a) and related compounds, including compounds of Formula I 
and Formula II as defined herein. Such methods comprise the steps of 
cultivating cells derived from a Streptomyces aizunensis strain, incubating 
said cultured cells aerobically in a growth medium for such time as is required 
for production of the desired compound, extracting said medium with a solvent 
such as methanol or ethanol and purifying the compound from the crude 
extract. The Streptomyces aizunensis strain which may be used in the 
methods of the invention may be NRRL B-11277 or a mutant thereof. A 
preferred strain of Streptomyces aizunensis useful in the methods of the 
invention is a mutant strain identified as [C03]023 (deposit accession number 
IDAC 070803-1); a most preferred strain of Streptomyces aizunensis useful in 
the methods of the invention is a mutant strain identified as [C03U03]023 
(deposit accession number IDAC 231203-02). The invention also 
encompasses the Streptomyces aizunensis strains identified by deposit 
accession numbers IDAC 070803-1 and IDAC 231203-02. 
[013] The invention also includes methods of inhibiting fungal cell growth, 
which comprise contacting a fungal cell with a compound of Formula I, a 
compound of Formula II or compound 2(a), or a pharmaceutically acceptable 
salt thereof. In addition, the invention encompasses methods for treating a 
fungal infection in a mammal, which comprise administering to a mammal 
suffering from such an infection, a therapeutically effective amount of a 
compound of Formula I, a compound of Formula II or compound 2(a), or a 
pharmaceutically acceptable salt thereof. The methods of the invention are 
particularly useful for treating fungal infections or inhibiting the growth of 
fungal cells in mammals caused by Candida albicans. The invention also 
encompasses methods for treating or inhibiting other types of fungal infections 
in a subject, wherein said fungal infections include those caused by Candida 
sp. such as C. glabrata, C. lusitaniae, C. parapsilosis, C. krusei, C. tropicalis, 
S. cerevisiae; Aspergillus sp. such as A. fumigatus, A. niger, A. terreus, A. 
flavus; Fusarium spp.; Scedosporium spp.; Cryptococcus spp.; Mucor spp.; 
Histoplasma spp.; Trichosporon spp.; and Blastomyces spp. Such methods 
comprise administering to a subject suffering from the fungal infection, a 
therapeutically effective amount of a compound of Formula I, Formula II or 
compound 2(a), or a pharmaceutically acceptable salt thereof. 
[014] The invention also provides methods of inhibiting cancer cell growth, 
which comprise contacting said cancer cell with a compound of Formula I, 
Formula II or compound 2(a), or a pharmaceutically acceptable salt thereof. 
The invention further encompasses methods for treating cancer in a subject, 
comprising administering to said subject suffering from said cancer, a 
therapeutically effective amount of a compound of Formula I, Formula II or 
compound 2(a) or a pharmaceutically acceptable salt thereof. Examples of 
cancers that may be treated or inhibited according to the methods of the 
invention include leukemia, non-small cell lung cancer, colon cancer, CNS 
cancer, melanoma, ovarian cancer, renal cancer, prostate cancer and breast 
cancer. 

[015] The present invention also provides the biosynthetic locus from 
Streptomyces aizunensis (NRRL B-11277) which biosynthetic locus is 
responsible for producing the compound of Formula 2(a). Streptomyces 
aizunensis was not previously reported to produce Compound 2(a). We have 
now discovered, in the Streptomyces aizunensis genome, the gene cluster 
responsible for the production of the Compound 2(a). Thus the invention 
provides polynucleotides and polypeptides useful in the production and 
engineering of compounds of Formula I and Compound 2(a). The invention 
also provides chemical modifications of compounds of Formula I and 
Compound 2(a). 

[016] In one aspect, the invention relates to the biosynthetic locus for 
production of a polyketide of Formula I and provides, in one embodiment, an 
isolated, purified or enriched nucleic acid for production of a polyketide of 
Formula I comprising a nucleic acid encoding at least one domain of the 
polyketide synthase system formed by the polyketide synthases of SEQ ID 
NOS: 21, 23, 25, 27, 29, 31, 33, 35 and 37. 

[017] In a further embodiment, the nucleic acid encodes one or more 
domains of the polyketide synthase of SEQ ID NO: 21 and comprises a 
nucleic acid selected from the group consisting of: a) SEQ ID NO: 22; b) the 
nucleic acid of residues 169-354 of SEQ ID NO: 22, the nucleic acid of 
residues 421-1698 of SEQ ID NO: 22, the nucleic acid of residues 1789-3093 
of SEQ ID NO: 22, the nucleic acid of residues 3910-4551 of SEQ ID NO: 22, 
the nucleic acid of residues 4807-4992 of SEQ ID NO: 22, the nucleic acid of 
residues 5068-6354 of SEQ ID NO: 22, the nucleic acid of residues 6403- 
7686 of SEQ ID NO: 22, the nucleic acid of residues 8497-9135 of SEQ ID 
NO: 22, the nucleic acid of residues 9388-9573 of SEQ ID NO: 22, the nucleic 
acid of residues 9643-10920 of SEQ ID NO: 22, the nucleic acid of residues 
10978-12267 of SEQ ID NO: 22, the nucleic acid of residues 12304-12624 of 
SEQ ID NO: 22, the nucleic acid of residues 13834-14487 of SEQ ID NO: 22, 
the nucleic acid of residues 14731-14916 of SEQ ID NO: 22, the nucleic acid 
of residues 15019-16314 of SEQ ID NO: 22, the nucleic acid of residues 
16378-17649 of SEQ ID NO: 22, the nucleic acid of residues 18439-19080 of 
SEQ ID NO: 22, the nucleic acid of residues 19330-19515 of SEQ ID NO: 22, 
the nucleic acid of residues 19585-20862 of SEQ ID NO: 22, the nucleic acid 
of residues 20935-22206 of SEQ ID NO: 22, the nucleic acid of residues 
23107-23754 of SEQ ID NO: 22, the nucleic acid of residues 24004-24189 of 
SEQ ID NO: 22; c) a nucleic acid having at least 80% identity to a nucleic acid 
of a) or b); and d) a nucleic acid complementary to a nucleic acid of a), b) or 
c). 
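
By way of illustration only, the residue ranges recited in b) above can be read as 1-based, inclusive coordinates on SEQ ID NO: 22. The following Python sketch assumes that coordinate convention and uses a hypothetical placeholder sequence, not the actual SEQ ID NO: 22; the function names are introduced here for the example. It extracts a sub-sequence for a recited range and checks whether the length of each range is a whole number of codons.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive; assumed convention)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside a sequence of length {len(seq)}")
    return seq[start - 1:end]


def range_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


# Three of the ranges recited for SEQ ID NO: 22 in paragraph [017] b).
RANGES = [(169, 354), (421, 1698), (3910, 4551)]

if __name__ == "__main__":
    # Hypothetical stand-in sequence; the real sequence is in the sequence listing.
    placeholder = "ACGT" * 1500
    for start, end in RANGES:
        length = range_length(start, end)
        fragment = subsequence(placeholder, start, end)
        print(start, end, length, len(fragment), length % 3 == 0)
```

Each of the three ranges shown has a length divisible by three, which is what would be expected of a range delimiting an in-frame protein domain.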

[018] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 23 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 24; b) the nucleic acid of 
residues 109-1386 of SEQ ID NO: 24, the nucleic acid of residues 1477-2757 
of SEQ ID NO: 24, the nucleic acid of residues 2794-3114 of SEQ ID NO: 24, 
the nucleic acid of residues 4231-4881 of SEQ ID NO: 24, the nucleic acid of 
residues 5116-5301 of SEQ ID NO: 24, the nucleic acid of residues 5380- 
6645 of SEQ ID NO: 24, the nucleic acid of residues 6694-7977 of SEQ ID 
NO: 24, the nucleic acid of residues 8878-9519 of SEQ ID NO: 24, the nucleic 
acid of residues 9772-9957 of SEQ ID NO: 24; c) a nucleic acid having at 
least 80% identity to a nucleic acid of a) or b); and d) a nucleic acid 
complementary to a nucleic acid of a), b) or c). 
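
The "at least 80% identity" criterion of c) presupposes some measure of sequence identity. As a minimal sketch, and assuming that the two sequences have already been aligned to equal length with "-" marking gaps (the alignment method is not specified in this passage and is outside the sketch), percent identity can be computed as the number of matching columns divided by the number of aligned columns. The function name and the choice of denominator are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a residue counts as a mismatch (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
    print(identity, identity >= 80.0)
```

Other conventions exist, for example dividing by the length of the shorter sequence, and they can give different values for the same pair of sequences.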

[019] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 25 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 26; b) the nucleic acid of 
residues 106-1383 of SEQ ID NO: 26, the nucleic acid of residues 1447-2721 
of SEQ ID NO: 26, the nucleic acid of residues 2755-3081 of SEQ ID NO: 26, 
the nucleic acid of residues 4315-4965 of SEQ ID NO: 26, the nucleic acid of 
residues 5206-5391 of SEQ ID NO: 26, the nucleic acid of residues 5491- 
6768 of SEQ ID NO: 26, the nucleic acid of residues 6841-8142 of SEQ ID 
NO: 26, the nucleic acid of residues 8941-9582 of SEQ ID NO: 26, the nucleic 
acid of residues 9832-10017 of SEQ ID NO: 26, the nucleic acid of residues 
10081-11358 of SEQ ID NO: 26, the nucleic acid of residues 11407-12675 of 
SEQ ID NO: 26, the nucleic acid of residues 13480-14118 of SEQ ID NO: 26, 
the nucleic acid of residues 14383-14568 of SEQ ID NO: 26, the nucleic acid 
of residues 14638-15912 of SEQ ID NO: 26, the nucleic acid of residues 
15967-17244 of SEQ ID NO: 26, the nucleic acid of residues 17278-17598 of 
SEQ ID NO: 26, the nucleic acid of residues 18880-19530 of SEQ ID NO: 26, 
the nucleic acid of residues 19795-19980 of SEQ ID NO: 26; c) a nucleic acid 
having at least 80% identity to a nucleic acid of a) or b); and d) a nucleic acid 
complementary to a nucleic acid of a), b) or c). 
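
Element d) of each of the above groups refers to a complementary nucleic acid. The following Python sketch, given for illustration only, derives the complementary strand of a DNA sequence written with the letters A, C, G, T and N. It returns the reverse complement, that is, the complementary strand written 5' to 3'; presenting it in that orientation is an assumption of the example, as is the function name.

```python
# Watson-Crick pairing for DNA; N (any base) pairs with N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of dna, written 5' to 3'."""
    allowed = set("ACGTNacgtn")
    unexpected = set(dna) - allowed
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.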

[020] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 27 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 28; b) the nucleic acid of 
residues 103-1380 of SEQ ID NO: 28, the nucleic acid of residues 1450-2760 
of SEQ ID NO: 28, the nucleic acid of residues 3583-4218 of SEQ ID NO: 28, 
the nucleic acid of residues 4468-4653 of SEQ ID NO: 28; c) a nucleic acid 
having at least 80% identity to a nucleic acid of a) or b); and d) a nucleic acid 
complementary to a nucleic acid of a), b) or c). 

[021] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 29 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 30; b) the nucleic acid of 
residues 103-1380 of SEQ ID NO: 30, the nucleic acid of residues 1459-2754 
of SEQ ID NO: 30, the nucleic acid of residues 3655-4293 of SEQ ID NO: 30, 
the nucleic acid of residues 4540-4725 of SEQ ID NO: 30, the nucleic acid of 
residues 4804-6081 of SEQ ID NO: 30, the nucleic acid of residues 6136- 
7419 of SEQ ID NO: 30, the nucleic acid of residues 7456-7776 of SEQ ID 
NO: 30, the nucleic acid of residues 8938-9588 of SEQ ID NO: 30, the nucleic 
acid of residues 9832-10017 of SEQ ID NO: 30, the nucleic acid of residues 
10087-11364 of SEQ ID NO: 30, the nucleic acid of residues 11428-12711 of 
SEQ ID NO: 30, the nucleic acid of residues 12745-13065 of SEQ ID NO: 30, 
the nucleic acid of residues 14278-14928 of SEQ ID NO: 30, the nucleic acid 
of residues 15187-15372 of SEQ ID NO: 30; c) a nucleic acid having at least 
80% identity to a nucleic acid of a) or b); and d) a nucleic acid complementary 
to a nucleic acid of a), b) or c). 

[022] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 31 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 32; b) the nucleic acid of 
residues 103-1380 of SEQ ID NO: 32, the nucleic acid of residues 1438-2742 
of SEQ ID NO: 32, the nucleic acid of residues 2776-3096 of SEQ ID NO: 32, 
the nucleic acid of residues 4267-4917 of SEQ ID NO: 32, the nucleic acid of 
residues 5209-5394 of SEQ ID NO: 32, the nucleic acid of residues 5464- 
6741 of SEQ ID NO: 32, the nucleic acid of residues 6787-8070 of SEQ ID 
NO: 32, the nucleic acid of residues 8107-8427 of SEQ ID NO: 32, the nucleic 
acid of residues 9562-10212 of SEQ ID NO: 32, the nucleic acid of residues 
10447-10632 of SEQ ID NO: 32, the nucleic acid of residues 10702-11979 of 
SEQ ID NO: 32, the nucleic acid of residues 12049-13326 of SEQ ID NO: 32, 
the nucleic acid of residues 13366-13686 of SEQ ID NO: 32, the nucleic acid 
of residues 14932-15582 of SEQ ID NO: 32, the nucleic acid of residues 
15853-16038 of SEQ ID NO: 32; c) a nucleic acid having at least 80% identity 
to a nucleic acid of a) or b); and d) a nucleic acid complementary to a nucleic 
acid of a), b) or c). 

[023] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 33 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 34; b) the nucleic acid of 
residues 103-1380 of SEQ ID NO: 34, the nucleic acid of residues 1441-2751 
of SEQ ID NO: 34, the nucleic acid of residues 3613-4248 of SEQ ID NO: 34, 
the nucleic acid of residues 4498-4683 of SEQ ID NO: 34, the nucleic acid of 
residues 4753-6030 of SEQ ID NO: 34, the nucleic acid of residues 6199- 
7515 of SEQ ID NO: 34, the nucleic acid of residues 8356-8994 of SEQ ID 
NO: 34, the nucleic acid of residues 9247-9432 of SEQ ID NO: 34; c) a 
nucleic acid having at least 80% identity to a nucleic acid of a) or b); and d) a 
nucleic acid complementary to a nucleic acid of a), b) or c). 
[024] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 35 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 36; b) the nucleic acid of 
residues 118-1395 of SEQ ID NO: 36, the nucleic acid of residues 1507-2823 
of SEQ ID NO: 36, the nucleic acid of residues 2860-3180 of SEQ ID NO: 36, 
the nucleic acid of residues 4366-5016 of SEQ ID NO: 36, the nucleic acid of 
residues 5251-5436 of SEQ ID NO: 36, the nucleic acid of residues 5503- 
6780 of SEQ ID NO: 36, the nucleic acid of residues 6841-8154 of SEQ ID 
NO: 36, the nucleic acid of residues 8191-8511 of SEQ ID NO: 36, the nucleic 
acid of residues 9562-10638 of SEQ ID NO: 36, the nucleic acid of residues 
10651-11301 of SEQ ID NO: 36, the nucleic acid of residues 11536-11721 of 
SEQ ID NO: 36, the nucleic acid of residues 11794-13071 of SEQ ID NO: 36, 
the nucleic acid of residues 13117-14409 of SEQ ID NO: 36, the nucleic acid 
of residues 14443-14763 of SEQ ID NO: 36, the nucleic acid of residues 
15898-16548 of SEQ ID NO: 36, the nucleic acid of residues 16789-16974 of 
SEQ ID NO: 36, the nucleic acid of residues 17056-18333 of SEQ ID NO: 36, 
the nucleic acid of residues 18391-19671 of SEQ ID NO: 36, the nucleic acid 
of residues 19714-20034 of SEQ ID NO: 36, the nucleic acid of residues 
21184-21834 of SEQ ID NO: 36, the nucleic acid of residues 22087-22272 of 
SEQ ID NO: 36; c) a nucleic acid having at least 80% identity to a nucleic acid 
of a) or b); and d) a nucleic acid complementary to a nucleic acid of a), b) or 
c). 

[025] In another embodiment the nucleic acid encodes one or more domains 
of the polyketide synthase of SEQ ID NO: 37 and comprises a nucleic acid 
selected from the group consisting of: a) SEQ ID NO: 38; b) the nucleic acid of 
residues 100-1377 of SEQ ID NO: 38, the nucleic acid of residues 1504-2778 
of SEQ ID NO: 38, the nucleic acid of residues 2812-3132 of SEQ ID NO: 38, 
the nucleic acid of residues 4258-4908 of SEQ ID NO: 38, the nucleic acid of 
residues 5143-5328 of SEQ ID NO: 38, the nucleic acid of residues 5395- 
6672 of SEQ ID NO: 38, the nucleic acid of residues 6739-8019 of SEQ ID 
NO: 38, the nucleic acid of residues 8056-8376 of SEQ ID NO: 38, the nucleic 
acid of residues 9607-10257 of SEQ ID NO: 38, the nucleic acid of residues 
10537-10722 of SEQ ID NO: 38, the nucleic acid of residues 10945-11616 of 
SEQ ID NO: 38; c) a nucleic acid having at least 80% identity to a nucleic 
acid of a) or b); and d) a nucleic acid complementary to a nucleic acid of a), b) 
or c). 

[026] The invention also provides nucleic acids involved in the biosynthesis 
of a polyketide of Formula I other than those encoding a domain of the 
polyketide synthase system. In this embodiment, the invention provides an 
isolated, purified or enriched nucleic acid selected from the group consisting 
of: a) a nucleic acid of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 40, 42, 44, 
46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 and 78; b) a 
nucleic acid encoding a polypeptide of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 
19, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 71, 73, 75 and 
77; c) a nucleic acid having at least 75% identity to a nucleic acid of (a) or (b); 
and d) a nucleic acid complementary to a nucleic acid of (a), (b) or (c). 
[027] The invention further provides a nucleic acid that is hybridizable under 
stringent conditions to any one of the above nucleic acids and is substitutable 
for the nucleic acid to which it specifically hybridizes to direct the synthesis of 
a compound of Formula I. The invention further provides an isolated, purified 
or enriched nucleic acid comprising the sequence of at least two, preferably 
three, more preferably five, still more preferably 7 or more of the above 
nucleic acids. 

[028] The invention further provides an expression vector comprising any of 
the above nucleic acids. The invention further provides a host cell 
transformed with such an expression vector. 
[029] In a further aspect, the invention provides a gene cluster for production 
of a polyketide of Formula I. In one embodiment, the gene cluster may 
comprise at least ten, preferably twelve, more preferably fifteen, still more 
preferably twenty or more of the above nucleic acids. In a further 
embodiment, the gene cluster may include the nucleic acids of a cosmid 
selected from the cosmids deposited under IDAC accession nos. 250203-01, 
250203-02, 250203-03, 250203-04, and 250203-05. In a further embodiment, 
the deposited cosmids are inserted into a prokaryotic host for expressing a 
product. The host may be E. coli, Streptomyces lividans, Streptomyces 
griseofuscus, Streptomyces ambofaciens, another species of Actinomycetes, 
or bacteria of the genus Bacillus, Corynebacteria, or Thermoactinomyces. In 
a further embodiment, the invention provides a nucleic acid which hybridizes 
under stringent hybridization conditions to the nucleic acids of the deposited 
cosmids and which encodes at least one protein involved in the biosynthesis 
of a polyene polyketide. In a further embodiment, the invention provides the 
isolated gene cluster from Streptomyces aizunensis encoding the biosynthetic 
pathway for the formation of compound 2(a), wherein said isolated gene 
cluster is the gene cluster formed by the deposited cosmids. 
[030] In another aspect, the invention relates to an isolated polypeptide for 
production of a polyketide of Formula I, and provides, in one embodiment, an 
amino acid sequence of a polyketide synthase domain of SEQ ID NO: 21, 
SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID 
NO: 31, SEQ ID NO: 33, SEQ ID NO: 35 and SEQ ID NO: 37. The domain 
may be a β-ketoacyl synthase (KS) domain, an acyl carrier protein (ACP) 
domain, an acyl transferase (AT) domain, a ketoreductase (KR) domain, an 
enoyl reductase (ER) domain, a thioesterase (TE) domain or a dehydratase 
(DH) domain. In one embodiment, the domain is a KS domain and the amino 
acid comprises a sequence selected from the group consisting of the amino 
acid of residues 141 to 566 of SEQ ID NO: 21, residues 1690 to 2118 of SEQ 
ID NO: 21, residues 3215 to 3640 of SEQ ID NO: 21, residues 5007 to 5438 
of SEQ ID NO: 21, residues 6529 to 6954 of SEQ ID NO: 21, residues 37 to 
462 of SEQ ID NO: 23, residues 1794 to 2215 of SEQ ID NO: 23, residues 36 
to 461 of SEQ ID NO: 25, residues 1831 to 2256 of SEQ ID NO: 25, residues 
3361 to 3786 of SEQ ID NO: 25, residues 4880 to 5304 of SEQ ID NO: 25, 
residues 35 to 460 of SEQ ID NO: 27, residues 35 to 460 of SEQ ID NO: 29, 
residues 1602 to 2027 of SEQ ID NO: 29, residues 3363 to 3788 of SEQ ID 
NO: 29, residues 35 to 460 of SEQ ID NO: 31, residues 1822 to 2247 of SEQ 
ID NO: 31, residues 3568 to 3993 of SEQ ID NO: 31, residues 35 to 460 of 
SEQ ID NO: 33, residues 1585 to 2010 of SEQ ID NO: 33, residues 40 to 465 
of SEQ ID NO: 35, residues 1835 to 2260 of SEQ ID NO: 35, residues 3932 to 
4357 of SEQ ID NO: 35, residues 5686 to 6111 of SEQ ID NO: 35, residues 
34 to 459 of SEQ ID NO: 37, residues 1799 to 2224 of SEQ ID NO: 37; and 
an amino acid sequence having at least 75% identity to any one of the above 
amino acid residues. 

[031] In another embodiment, the domain is an ACP domain and the amino 
acid comprises a sequence selected from the group consisting of the amino 
acid of: residues 57 to 118 of SEQ ID NO: 21, residues 1603 to 1664 of SEQ 
ID NO: 21, residues 3130 to 3191 of SEQ ID NO: 21, residues 4911 to 4972 
of SEQ ID NO: 21, residues 6444 to 6505 of SEQ ID NO: 21, residues 8002 to 
8063 of SEQ ID NO: 21, residues 1706 to 1767 of SEQ ID NO: 23, residues 
3258 to 3319 of SEQ ID NO: 23, residues 1736 to 1797 of SEQ ID NO: 25, 
residues 3278 to 3339 of SEQ ID NO: 25, residues 4795 to 4856 of SEQ ID 
NO: 25, residues 6599 to 6660 of SEQ ID NO: 25, residues 1490 to 1551 of 
SEQ ID NO: 27, residues 1514 to 1575 of SEQ ID NO: 29, residues 3278 to 
3339 of SEQ ID NO: 29, residues 5060 to 5124 of SEQ ID NO: 29, residues 
1737 to 1798 of SEQ ID NO: 31, residues 3483 to 3544 of SEQ ID NO: 31, 
residues 5285 to 5346 of SEQ ID NO: 31, residues 1500 to 1561 of SEQ ID 
NO: 33, residues 3083 to 3144 of SEQ ID NO: 33, residues 1751 to 1812 of 
SEQ ID NO: 35, residues 3846 to 3907 of SEQ ID NO: 35, residues 5597 to 
5658 of SEQ ID NO: 35, residues 7363 to 7424 of SEQ ID NO: 35, residues 
1715 to 1776 of SEQ ID NO: 37, residues 3513 to 3574 of SEQ ID NO: 37, 
and an amino acid sequence having at least 75% identity to any one of the 
above amino acid residues. 
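
The amino acid ranges recited for the KS and ACP domains of SEQ ID NO: 21 can be related to the nucleic acid ranges recited earlier for SEQ ID NO: 22. The following Python sketch is illustrative only and rests on an assumption not stated in this passage, namely that SEQ ID NO: 22 is the coding sequence of SEQ ID NO: 21 with codon 1 occupying nucleotides 1 to 3. Under that assumption, amino acid residue n spans nucleotides 3n-2 to 3n. The function name is introduced here for the example.

```python
def aa_to_nt_range(aa_start: int, aa_end: int) -> tuple[int, int]:
    """Map an inclusive amino acid range to the inclusive nucleotide range
    that encodes it, assuming codon 1 occupies nucleotides 1-3."""
    if not (1 <= aa_start <= aa_end):
        raise ValueError("expected 1 <= aa_start <= aa_end")
    return (3 * aa_start - 2, 3 * aa_end)


# (domain, amino acid range on SEQ ID NO: 21, nucleotide range on SEQ ID NO: 22),
# all taken from the ranges recited in the text.
RECITED = [
    ("ACP", (57, 118), (169, 354)),
    ("KS", (141, 566), (421, 1698)),
    ("ACP", (1603, 1664), (4807, 4992)),
]

if __name__ == "__main__":
    for domain, aa_range, nt_range in RECITED:
        computed = aa_to_nt_range(*aa_range)
        print(domain, aa_range, computed, computed == nt_range)
```

For the three domains shown, the computed nucleotide ranges coincide with the recited ones, which is consistent with the stated assumption.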

[032] In another embodiment, the domain is an AT domain and the amino acid 
comprises a sequence selected from the group consisting of the amino acid 
of: residues 597 to 1013 of SEQ ID NO: 21, residues 2135 to 2562 of SEQ ID 
NO: 21, residues 3660 to 4089 of SEQ ID NO: 21, residues 5460 to 5883 of 
SEQ ID NO: 21, residues 6979 to 7402 of SEQ ID NO: 21, residues 493 to 
919 of SEQ ID NO: 23, residues 2232 to 2659 of SEQ ID NO: 23, residues 
483 to 907 of SEQ ID NO: 25, residues 2281 to 2714 of SEQ ID NO: 25, 
residues 3803 to 4225 of SEQ ID NO: 25, residues 5323 to 5748 of SEQ ID 
NO: 25, residues 484 to 920 of SEQ ID NO: 27, residues 487 to 918 of SEQ 
ID NO: 29, residues 2046 to 2473 of SEQ ID NO: 29, residues 3810 to 4237 
of SEQ ID NO: 29, residues 480 to 914 of SEQ ID NO: 31, residues 2263 to 
2690 of SEQ ID NO: 31, residues 4017 to 4442 of SEQ ID NO: 31, residues 
481 to 917 of SEQ ID NO: 33, residues 2067 to 2505 of SEQ ID NO: 33, 
residues 503 to 941 of SEQ ID NO: 35, residues 2281 to 2718 of SEQ ID NO: 
35, residues 4373 to 4803 of SEQ ID NO: 35, residues 6131 to 6557 of SEQ 
ID NO: 35, residues 502 to 926 of SEQ ID NO: 37, residues 2247 to 2673 of 
SEQ ID NO: 37; and an amino acid sequence having at least 75% identity to 
any one of the above amino acid residues. 

[033] In another embodiment, the domain is a KR domain and the amino acid 
comprises a sequence selected from the group consisting of the amino acid 
of: residues 1304 to 1517 of SEQ ID NO: 21, residues 2833 to 3045 of SEQ
ID NO: 21, residues 4612 to 4829 of SEQ ID NO: 21, residues 6147 to 6360
of SEQ ID NO: 21, residues 7703 to 7918 of SEQ ID NO: 21, residues 1411 to
1627 of SEQ ID NO: 23, residues 2960 to 3173 of SEQ ID NO: 23, residues
1439 to 1655 of SEQ ID NO: 25, residues 2981 to 3194 of SEQ ID NO: 25,
residues 4494 to 4706 of SEQ ID NO: 25, residues 6294 to 6510 of SEQ ID
NO: 25, residues 1195 to 1406 of SEQ ID NO: 27, residues 1219 to 1431 of
SEQ ID NO: 29, residues 2980 to 3196 of SEQ ID NO: 29, residues 4760 to 
4976 of SEQ ID NO: 29, residues 1423 to 1639 of SEQ ID NO: 31, residues 
3188 to 3404 of SEQ ID NO: 31, residues 4978 to 5194 of SEQ ID NO: 31, 
residues 1205 to 1416 of SEQ ID NO: 33, residues 2786 to 2998 of SEQ ID 
NO: 33, residues 1456 to 1672 of SEQ ID NO: 35, residues 3551 to 3767 of 
SEQ ID NO: 35, residues 5300 to 5516 of SEQ ID NO: 35, residues 7062 to 
7288 of SEQ ID NO: 35, residues 1420 to 1636 of SEQ ID NO: 37, residues 
3203 to 3419 of SEQ ID NO: 37; and an amino acid sequence having at least 
75% identity to any one of the above amino acid residues. 

[034] In another embodiment, the domain is a DH domain and the amino acid
comprises a sequence selected from the group consisting of the amino acid
of: residues 4102 to 4208 of SEQ ID NO: 21, residues 932 to 1038 of SEQ ID
NO: 23, residues 919 to 1027 of SEQ ID NO: 25, residues 5761 to 5866 of
SEQ ID NO: 25, residues 2486 to 2592 of SEQ ID NO: 29, residues 4249 to
4355 of SEQ ID NO: 29, residues 926 to 1032 of SEQ ID NO: 31, residues
2703 to 2809 of SEQ ID NO: 31, residues 4456 to 4562 of SEQ ID NO: 31,
residues 954 to 1060 of SEQ ID NO: 35, residues 2731 to 2837 of SEQ ID
NO: 35, residues 4815 to 4921 of SEQ ID NO: 35, residues 6572 to 6678 of
SEQ ID NO: 35, residues 938 to 1044 of SEQ ID NO: 37, residues 2686 to
2792 of SEQ ID NO: 37; and an amino acid sequence having at least 75%
identity to any one of the above amino acid residues.

[035] In another embodiment, the domain is an ER domain and the amino
acid comprises a sequence selected from the group consisting of the amino
acid of: residues 3188 to 3546 of SEQ ID NO: 35 and any amino acid
sequence having at least 75% identity to residues 3188 to 3546 of SEQ ID
NO: 35.

[036] In another embodiment, the domain is a TE domain and the amino
acid comprises a sequence selected from the group consisting of the amino 
acid of: residues 3649 to 3872 of SEQ ID NO: 37, and any amino acid 
sequence having at least 75% identity to residues 3649 to 3872 of SEQ ID 
NO: 37. 

[037] In another embodiment, the invention provides a polypeptide involved 
in the biosynthesis of a polyketide of Formula I other than a polypeptide 
encoding a domain of the polyketide synthase system of the invention. In this 
embodiment, the invention provides an isolated polypeptide for the production 
of a polyketide of Formula I selected from the group consisting of: a) SEQ ID 
NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69, 71, 73, 75 and 77; and b) a polypeptide which is at least
75% identical to SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 39, 41, 43, 45, 47,
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75 and 77.

[038] In another aspect, the invention provides a method of making a
polypeptide having a sequence selected from the group consisting of SEQ ID
NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41,
43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75 and 77
comprising the steps of: (a) introducing a nucleic acid encoding said 
polypeptide, said nucleic acid being operably linked to a promoter, into a 
bacterial host cell; and (b) culturing the transformed host cell under conditions 
which result in the expression of the polypeptide. 

[039] In another aspect the invention is drawn to a method for increasing the
yield of the polyketides of the invention using the deposited cosmids or the
nucleic acids described above, said method comprising the steps of
transforming a prokaryotic host with the cosmids or nucleic acids and culturing
the transformed prokaryotic host under conditions which result in the
expression of the polyketide.

BRIEF DESCRIPTION OF THE DRAWINGS 

[040] Figure 1: Diagram of the biosynthetic locus for compound 2(a) from
Streptomyces aizunensis. Also indicated are the positions of cosmids
deposited under IDAC accession numbers 250203-01, 250203-02, 250203-03,
250203-04 and 250203-05, which span the locus of compound 2(a).

[041] Figure 2a-d: Multiple amino acid alignment comparing the 26 KS
domains present in the polyketide synthase (PKS) for compound 2(a) (ORFs
10 to 18). The boundaries and key residues (highlighted in black) of the KS
domains were chosen as described by Kakavas et al., J. Bacteriol. 179,
7515-7522 (1997).

[042] Figure 3a-d: Multiple amino acid alignment comparing the 26 AT 
domains present in the compound 2(a) PKS (ORFs 10 to 18). The boundaries 
and key residues (highlighted in black) of the AT domains were chosen as 
described by Kakavas et al., supra.

[043] Figure 4: Multiple amino acid alignment comparing the 15 DH domains 
present in the compound 2(a) PKS (ORFs 10, 11, 12, 14, 15, 17 and 18). The
boundaries and key residues (highlighted in black) of the DH domains were
chosen as described by Kakavas et al., supra. The inactive DH domains are
highlighted. 

[044] Figure 5: Amino acid alignment comparing the ER domain present in 
the compound 2(a) PKS (ORF 17) with the ER domains from modules 5 and 
15 in the nystatin biosynthetic locus as described by Brautaset et al., Chem.
Biol., 7, 395-403 (2000). The boundaries and key residues (highlighted in
black) of the ER domain were chosen as described by Kakavas et al., supra.

[045] Figure 6a and 6b: Multiple amino acid alignment comparing the 26 KR
domains present in the compound 2(a) PKS (ORFs 10 to 18). The boundaries 
and key residues (highlighted in black) of the KR domains were chosen as 
described by Kakavas et al. supra, and Fisher et al. Structure Fold Des. 8, 
339-347 (2000). The inactive KR domain found in ORF 13/module 12 is 
highlighted. 

[046] Figure 7: Multiple amino acid alignment comparing the 27 ACP 
domains present in the compound 2(a) PKS (ORFs 10 to 18). The boundaries 
and key serine residues (highlighted in black) of the ACP domains were 
chosen as described by Kakavas et al. supra. 

[047] Figure 8: Amino acid alignment comparing the TE domain present in 
the compound 2(a) PKS (ORF 18) with the TE domain from module 7 in the 
nystatin biosynthetic locus as described by Brautaset et al., supra. The
boundaries and key residues (highlighted in black) of the TE domain were
chosen as described by Kakavas et al., supra.

[048] In each of the Clustal alignments (Figs 2 to 8) a line below the
alignment is used to mark strongly conserved positions. In addition, three
characters, namely * (asterisk), : (colon) and . (period) are used, wherein "*"
indicates positions which have a single, fully conserved residue; ":" indicates
that one of the following strong groups is fully conserved: STA, NEQK, NHQK,
NDEQ, QHRK, MILV, MILF, HY, and FYW; and "." indicates that one of the
following weaker groups is fully conserved: CSA, ATV, SAG, STNK, STPA,
SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, and HFY.
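
The marking scheme described for the alignment figures can be illustrated with a short sketch. The function names and the three toy sequences are hypothetical, and the treatment of gap columns (left unmarked) is an assumption; only the strong and weaker residue groups are taken from the paragraph above.

```python
# Sketch: derive the conservation line ("*", ":", ".") for aligned sequences.
# The residue groups are the ones listed in the text; everything else
# (function names, gap handling, toy sequences) is illustrative only.

STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return the marker for one alignment column."""
    residues = {c.upper() for c in column}
    if "-" in residues:
        # Assumption: a column containing a gap is left unmarked.
        return " "
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one weaker group
    return " "


def conservation_line(aligned):
    """Build the marker line for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(column_symbol(col) for col in zip(*aligned))


if __name__ == "__main__":
    toy_alignment = ["MSTKAW", "MATRTD", "MSSKVG"]  # hypothetical data
    for seq in toy_alignment:
        print(seq)
    print(conservation_line(toy_alignment))
```

In this toy case the first column is marked "*", the next three ":", the fifth "." and the last is left blank.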

[049] Figure 9: Phylogenetic analysis of the 26 AT domains present in the
compound 2(a) PKS (ORFs 10 to 18) along with a malonyl-specific and a
methylmalonyl-specific AT domain present in modules 3 and 11 respectively
of the nystatin PKS system as described by Brautaset et al., supra.

[050] Figure 10a to 10c: biosynthetic pathway for compound 2(a) polyketide
core structure.

[051] Figure 11a and 11b: biosynthetic pathways for compound 2(a)
aminohydroxy-cyclopentenone (a) and deoxysugar (b) components.

[052] Figures 12a to 12f: outline of strategies for the genetic modification of
the locus for compound 2(a) providing for variants that functionally modify
compound 2(a).

[053] Figure 13: shows the data for compound 2(a) obtained by electrospray
mass spectrometry.

[054] Figure 14: shows the data for compound 2(a) obtained by UV (λmax).

[055] Figure 15: shows the data obtained for compound 2(a) by NMR at
500 MHz dissolved in d3-MeOH including proton (15A), carbon (15B), and
multidimensional pulse sequences gDQCOSY, gHSQC, gHMBC, and TOCSY
(15C, 15D, 15E and 15F, respectively).

[056] Figure 16: is a plot of the data from a study to evaluate the antifungal
activity of compound 2(a) against Candida albicans in a mouse model as
described in Example 5. Figure 16 depicts the percent survival versus days
post-inoculation with compound 2(a) (3 mg/kg), compound 2(a) (1 mg/kg),
Fungizone (0.25 mg/kg) and Fungizone (0.50 mg/kg).

[057] Figure 17: proton-NMR (Figure 17A) and carbon-13 NMR (Figure 17B)
spectral assignments for Compound 2(a) as discussed in Example 3.

DETAILED DESCRIPTION OF THE INVENTION 

[058] The present invention encompasses compounds of Formula I, and
pharmaceutically acceptable salts thereof:

[Chemical structure of Formula I not reproduced]

Formula I
wherein, 

A is selected from the group consisting of -NR1R2, -N=CR1R2,
-NR1-C(=NR2)-NHR3 and -NH-C(O)-R4;

R1, R2, R3 and R4 are each independently selected from the
group consisting of H, C1-6 alkyl, C2-6 alkenyl, C3-6 cycloalkyl, C2-6
heterocycloalkyl, aryl, heteroaryl and amino acid, wherein said alkyl, alkenyl,
aryl and heteroaryl are optionally substituted with a group selected from
halogen, OH, NO2, NH2 or aryl, said aryl being optionally further substituted
with one or more groups independently selected from halogen, OH, NO2 or
NH2;

B is selected from ethene-1,2-diyl or [structure bearing R10, not reproduced];

wherein R10 is oxo or OR11;

wherein R11 is H or a heterocycloalkyl, the
heterocycloalkyl being optionally substituted with 1-4
substituents selected from OX, C1-3 alkyl and -O-C(O)R1,
wherein X is H or, when there are at least two
neighboring substituent groups that are OX, then the X
can be a bond such that the two neighboring oxygen
groups form a five-membered acetal ring of the formula:

[structure not reproduced]; wherein R5 and R6 are each
independently selected from the group consisting of H,
C1-6 alkyl, and C2-7 alkenyl;




D is selected from: [structure not reproduced], -NR12aR12a, and OR12

R12 is selected from H, C1-6 alkyl optionally substituted with 1 to
2 phenyl groups, wherein the phenyl group is optionally
substituted with C1-6 alkyl and halo;

R12a and R12a are each independently selected from H, C1-6 alkyl,
C2-6 alkenyl, C3-6 cycloalkyl, C2-6 heterocycloalkyl, aryl, heteroaryl
and amino acid, wherein said alkyl, alkenyl, aryl and heteroaryl
are optionally substituted with a group selected from halogen,
OH, NO2, NH2 or aryl, said aryl being optionally further
substituted with one or more groups independently selected
from halogen, OH, NO2 or NH2;



W1 is [structure bearing OX1 and OX2, not reproduced];

W2 is [structure bearing OX3, OX4, OX5 and OX6, not reproduced];

W3 is [structure bearing OX7, OX8 and OX9, not reproduced];

X1, X2, X3, X4, X5, X6, X7, X8, X9, X12 and X13 are each independently
selected from H, -C(O)-R7 and a bond such that when any two neighboring
X1, X2, X3, X4, X5, X6, X7, X8, X9, X12 and X13 are a bond then the two
neighboring oxygen atoms and their attached carbon atoms together form a
six-membered acetal ring of the formula:

[structure bearing R5 and R6, not reproduced]

R5, R6 and R7 are each independently selected from H, C1-6 alkyl and
C2-7 alkenyl;

Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y9, Y10, Y11, Y12, Y13 and Y15 are each
independently selected from the group consisting of ethene-1,2-diyl,
ethane-1,2-diyl and [structure not reproduced], wherein said ethene-1,2-diyl
and ethane-1,2-diyl groups are optionally substituted with a methyl
group;

Z is selected from OH, NHR1 [remaining structures not reproduced]; and when
the dotted line is a bond then Z is oxo or NR[superscript illegible];

R8 is selected from H, C1-6 alkyl, C2-6 alkenyl;
R9 is C1-6 alkyl optionally substituted with aryl.

[058] In a first embodiment the invention provides compounds of Formula I 
wherein Z is oxo; and all other groups are as previously defined; or a 
pharmaceutically acceptable salt thereof.

[059] Within this first embodiment Z is oxo, A is -NR1R2; and all other groups
are as previously defined; or a pharmaceutically acceptable salt thereof.

[060] Further within this embodiment Z is oxo, A is -NR1R2; and D is
[structure not reproduced]; and all other groups are as previously defined; or a
pharmaceutically acceptable salt thereof.

[061] Within the first embodiment the invention provides compounds of
Formula I wherein Z is oxo and A is -NH-C(O)-R4; and all other groups are as
previously defined; or a pharmaceutically acceptable salt thereof.

[062] Further within this embodiment Z is oxo and A is -NH-C(O)-R4 and D is
[structure not reproduced]; and all other groups are as previously defined; or a
pharmaceutically acceptable salt thereof.

[063] In a second embodiment the invention provides compounds of Formula
I wherein B is [structure bearing R10, not reproduced],
wherein R10 is oxo or OR11; and all other groups are as
previously defined; or a pharmaceutically acceptable salt thereof.

[064] Within this second embodiment R10 is OR11, wherein R11 is a
heterocycloalkyl, the heterocycloalkyl being optionally substituted with 1-4
substituents selected from OX, C1-3 alkyl and -O-C(O)R1, wherein X is H or,
when there are at least two neighboring substituent groups that are OX, then
the X can be a bond such that the two neighboring oxygen groups form a
five-membered acetal ring of the formula:

[structure not reproduced]

[065] Within this embodiment R11 is a heterocycloalkyl, the heterocycloalkyl
being optionally substituted with 1-4 substituents selected from OX, C1-3 alkyl
and -O-C(O)R1, wherein X is H or, when there are at least two neighboring
substituent groups that are OX, then the X can be a bond such that the two
neighboring oxygen groups form a five-membered acetal ring of the formula:

[structure not reproduced]

and A is -NR1R2; and all other groups are as
previously defined; or a pharmaceutically acceptable salt thereof.

[066] Further within this embodiment the invention provides compounds of
Formula I, wherein R11 is a heterocycloalkyl, the heterocycloalkyl being
optionally substituted with 1-4 substituents selected from OX, C1-3 alkyl and
-O-C(O)R1, wherein X is H or, when there are at least two neighboring
substituent groups that are OX, then the X can be a bond such that the two
neighboring oxygen groups form a five-membered acetal ring of the formula:

[structure not reproduced], A is -NR1R2 and Z is oxo; and all other groups are
as previously defined; or a pharmaceutically acceptable salt thereof.

[067] Preferred compounds of the invention comprise compounds of Formula
II:

[Chemical structure of Formula II, bearing OR20, not reproduced]

Formula II
wherein A1 is -NH2, -N=CH-R13, amino acid or -NH-R14, wherein R13 is
hydrogen or phenyl and R14 is selected from the group consisting of isopropyl,
1-(4-nitrophenyl)methyl, cyclohexyl, and wherein said amino acid is attached
via its nitrogen atom;

[structure not reproduced]

wherein R15 is selected from the group consisting of methyl, isopropyl, phenyl,
4-nitrophenyl, 1-aminoethyl, 1-amino-1-(4-hydroxyphenyl)methyl,
1-amino-2-(4-hydroxyphenyl)ethyl, 1-amino-2-methylpropyl, 2-pyrrolidinyl and
1-amino-2-hydroxyethyl;

Y20 is selected from the group consisting of ethene-1,2-diyl and
[structure not reproduced];

Z1 is selected from the group consisting of: [structures not reproduced];

R20 is selected from the group consisting of hydrogen and
[structure not reproduced];

Y30 is ethene-1,2-diyl or ethane-1,2-diyl; and

D1 is hydroxy, methoxy or [structure not reproduced];

and pharmaceutically acceptable salts thereof.

[068] The present invention includes pharmaceutical compositions of the 
compounds of Formula II, said compositions comprising a therapeutically 
effective amount of the compound of Formula II or a pharmaceutically
acceptable salt thereof, and a pharmaceutically acceptable carrier.

[069] Particularly preferred compounds of the present invention include those
of Formula II

[Chemical structure of Formula II, bearing OR20, not reproduced]

Formula II

wherein A1 is amino (-NH2), and Y20, Z1, R20, Y30 and D1 are as defined in
Table A below.



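Table A. Compounds of Formula II wherein A1 is NH2

(″ repeats the entry above; cells whose content was a structure drawing or did
not survive extraction are marked [not reproduced].)

Compound | Y20 | Z1 | R20 | Y30 | D1
2(a) | ethene-1,2-diyl | [not reproduced] | 3,4,5-trihydroxy-6-methyl-tetrahydro-pyran-2-yl | ethane-1,2-diyl | [not reproduced]
2(b) | [not reproduced] | [not reproduced] | [not reproduced] | ″ | [not reproduced]
2(c) | ethene-1,2-diyl | [structure bearing OH, not reproduced] | [not reproduced] | ″ | [not reproduced]
2(d) | ″ | [not reproduced] | [not reproduced] | ″ | [not reproduced]
2(e) | ″ | [not reproduced] | [not reproduced] | ″ | [not reproduced]
2(f) | ″ | [structure bearing NH, not reproduced] | [not reproduced] | ″ | [not reproduced]
2(g) | ″ | [not reproduced] | ″ | ″ | hydroxy
2(h) | [not reproduced] | ″ | [not reproduced] | [not reproduced] | methoxy
2(i) | ″ | ″ | hydrogen | ″ | [not reproduced]
2(j) | [illegible] | ″ | ″ | ″ | hydroxy
2(k) | ″ | ″ | 3,4,5-trihydroxy-6-methyl-tetrahydro-pyran-2-yl | ethene-1,2-diyl | [not reproduced]
2(l) | ″ | [structure bearing OH, not reproduced] | ″ | ″ | [not reproduced]
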

Additional preferred compounds of the invention include compounds of
Formula II

[Chemical structure of Formula II not reproduced]

Formula II

as set forth in Tables B and C below,

wherein Y20 is ethene-1,2-diyl;

[definitions of Z1, R20 and D1 given as structure drawings, not reproduced]

Y30 is ethane-1,2-diyl; and

wherein A1 is -N=CH-R13 (Table B) or -NH-R14 (Table C).

Table B. Compounds of Formula II wherein A1 is -N=CH-R13 and Y20, Z1,
R20, Y30 and D1 are as defined above.

Compound | R13
2(m) | CH3
2(n) | phenyl


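Table C. Compounds of Formula II wherein A1 is -NH-R14 and Y20, Z1, R20,
Y30 and D1 are as defined above.

Compound | R14 | R15
2(o) | [structure not reproduced] | NA*
2(p) | isopropyl | NA
2(q) | 1-(4-nitrophenyl)methyl | NA
2(r) | cyclohexyl | NA
2(s) | [structure bearing R15, not reproduced] | CH3
2(t) | [structure not reproduced] | isopropyl
2(u) | [structure bearing R15, not reproduced] | phenyl
2(v) | [structure not reproduced] | 4-nitrophenyl
2(w) | [structure not reproduced] | 1-aminoethyl
2(x) | [structure not reproduced] | 1-amino-1-(4-hydroxyphenyl)methyl
2(y) | [structure not reproduced] | 1-amino-2-(4-hydroxyphenyl)ethyl
2(z) | [structure bearing R15, not reproduced] | 1-amino-2-methylpropyl
2(aa) | [structure not reproduced] | 2-pyrrolidinyl
2(ab) | [structure bearing R15, not reproduced] | 1-amino-2-hydroxyethyl

*NA = not applicable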



The compounds of Tables A, B and C are shown below. 

[Structure drawings not reproduced. Labels surviving in the source: Compounds
2(a), 2(b), 2(c), 2(g), 2(h), 2(k), 2(m), 2(o), 2(p), 2(q), 2(r), 2(s), 2(t), 2(x),
2(y), 2(z), 2(aa) and 2(ab).]



[070] The following bivalent moieties are referred to herein by the
nomenclature as indicated below (structure drawings not reproduced):

1-oxo-methylene-1,1-diyl
1-hydroxymethylene-1,1-diyl
1,3-dioxacyclopentane-2,2-diyl
(2-propylamino)methylene-1,1-diyl
1-benzyliminomethylene-1,1-diyl
oxirane-2,3-diyl.

The following monovalent moieties are referred to herein by the nomenclature
as indicated (structure drawings not reproduced):

(2-hydroxy-5-oxo-cyclopent-1-enyl)-amino
3,4,5-trihydroxy-6-methyl-tetrahydropyran-2-yl.



[071] The terms "polyketide" or "polyene polyketide" refer to a class of
polyketide compounds defined by Formula I or II. A preferred polyketide of
the invention is Compound 2(a), having the systematic name 56-Amino-
15,17,33,35,37,41,43,45,47,51,53-undecahydroxy-14,16,30-trimethyl-31-oxo-
29-(3,4,5-trihydroxy-6-methyl-tetrahydro-pyran-2-yloxy)-hexapentaconta-
2,4,6,8,12,18,20,22,24,26,38,48-dodecaenoic acid (2-hydroxy-5-oxo-
cyclopent-1-enyl)-amide. The term further includes compounds of this class
that can be used as intermediates in chemical synthesis.

[072] The terms "producer of compounds of Formula I" and "compounds of
Formula I-producing organism" refer to a microorganism that carries genetic
information necessary to produce a compound of Formula I, whether or not
the organism is known to produce a compound of Formula I. The terms
"producer of compounds of Formula II" and "compound of Formula
II-producing organism" refer to a microorganism that carries genetic information
necessary to produce a compound of Formula II, whether or not the organism
is known to produce a compound of Formula II. The terms "producer of
Compound 2(a)" and "Compound 2(a)-producing organism" refer to a
microorganism that carries genetic information necessary to produce
Compound 2(a), whether or not the organism is known to produce Compound
2(a). The term "polyketide producer" refers to a microorganism that carries
genetic information necessary to produce a polyketide of Formula I or II. The
terms apply equally to organisms in which the genetic information to produce 
the compound of Formula I or II or Compound 2(a) is found in the organism as 
it exists in its natural environment, and to organisms in which the genetic 
information is introduced by recombinant techniques. For the sake of 
particularity, specific organisms contemplated herein include organisms of the 
family Micromonosporaceae, of which preferred genera include 
Micromonospora, Actinoplanes and Dactylosporangium; the family
Streptomycetaceae, of which preferred genera include Streptomyces and
Kitasatospora; the family Pseudonocardiaceae, of which preferred genera are
Amycolatopsis and Saccharopolyspora; and the family Actinosynnemataceae,
of which preferred genera include Saccharothrix and Actinosynnema; however,
the terms are intended to encompass all organisms containing genetic
information necessary to produce a compound of Formula I or II or Compound
2(a). Preferred producers of a compound of Formula I or II or Compound 2(a)
include Streptomyces aizunensis (NRRL B-11277) and any mutant or
improved strain of Streptomyces aizunensis, including strain [C03]023 (IDAC 
accession no. 070803-01) and strain [C03U03]023 (IDAC accession no. 
231203-02). 

[073] The term "isolated" means that the material is removed from its original 
environment, e.g. the natural environment if it is naturally-occurring. For 
example, a naturally occurring polynucleotide or polypeptide present in a living 
organism is not isolated, but the same polynucleotide or polypeptide, 
separated from some or all of the coexisting materials in the natural system, is 
isolated. Such polynucleotides could be part of a vector and/or such 
polynucleotides or polypeptides could be part of a composition, and still be 
isolated in that such vector or composition is not part of its natural
environment. 

[074] The term "purified" does not require absolute purity; rather, it is 
intended as a relative definition. Individual nucleic acids obtained from a 
library have been conventionally purified to electrophoretic homogeneity. The 
purified nucleic acids of the present invention have been purified from the
remainder of the genomic DNA in the organism by at least 10^4 to 10^6 fold.
However, the term "purified" also includes nucleic acids which have been 
purified from the remainder of the genomic DNA or from other sequences in a 
library or other environment by at least one order of magnitude, preferably two 
or three orders of magnitude, and more preferably four or five orders of 
magnitude. 

[075] "Recombinant" means that the nucleic acid is present in the cell with 
"backbone" nucleic acid, wherein the nucleic acid is not present with 
"backbone" nucleic acid in its natural environment. "Recombinant" can also 
be defined to mean that the nucleic acid is adjacent to "backbone" nucleic acid 
to which it is not adjacent in its natural environment. "Enriched" nucleic acids 
represent 5% or more of the number of nucleic acid inserts in a population of 
nucleic acid backbone molecules. "Backbone" molecules include nucleic 
acids such as expression vectors, self-replicating nucleic acids, viruses, 
integrating nucleic acids, and other vectors or nucleic acids used to maintain 
or manipulate a nucleic acid of interest. Preferably, the enriched nucleic acids 
represent 15% or more, more preferably 50% or more, and most preferably 
90% or more, of the number of nucleic acid inserts in the population of 
recombinant backbone molecules. 

[076] "Recombinant" polypeptides or proteins refer to polypeptides or 
proteins produced by recombinant DNA techniques, i.e. produced from cells 
transformed by an exogenous DNA construct encoding the desired 
polypeptide or protein. "Synthetic" polypeptides or proteins are those 
prepared by chemical synthesis. 

[077] The term "gene" means the segment of DNA involved in producing a 
polypeptide chain; it includes regions preceding and following the coding 
region (leader and trailer) as well as, where applicable, intervening regions
(introns) between individual coding segments (exons).

[078] The terms "gene locus," "gene cluster," and "biosynthetic locus" refer to
a group of genes or variants thereof involved in the biosynthesis of the 
polyketide of Formula 2a. Genetic modification of gene locus, gene cluster or 
biosynthetic locus refers to any genetic recombinant techniques known in the 
art including mutagenesis, inactivation, or replacement of nucleic acids that 
can be applied to generate variants of the compounds of Formula 2a. Genetic 
modification of gene locus, gene cluster or biosynthetic locus refers to any 
genetic recombinant techniques known in the art including mutagenesis, 
inactivation, or replacement of nucleic acids that can be applied to generate 
genetic variants of compounds of Formula I. 

[079] A DNA or nucleotide "coding sequence" or "sequence encoding" a 
particular polypeptide or protein, is a DNA sequence which is transcribed and 
translated into a polypeptide or protein when placed under the control of 
appropriate regulatory sequences. 

[080] "Oligonucleotide" refers to a nucleic acid, generally of at least 10, 
preferably 15 and more preferably at least 20 nucleotides, preferably no more 
than 100 nucleotides, that are hybridizable to a genomic DNA molecule, a 
cDNA molecule, or an mRNA molecule encoding a gene, mRNA, cDNA or 
other nucleic acid of interest. 

[081] A promoter sequence is "operably linked to" a coding sequence
when RNA polymerase which initiates transcription at the promoter
transcribes the coding sequence into mRNA.

[082] "Digestion" of DNA refers to enzymatic cleavage of the DNA with a
restriction enzyme that acts only at certain sequences in the DNA. The
various restriction enzymes used herein are commercially available and their
reaction conditions, cofactors and other requirements were used as would be
known to the ordinary skilled artisan. For analytical purposes, typically 1 µg of
plasmid or DNA fragment is used with about 2 units of enzyme in about 20 µl
of buffer solution. For the purpose of isolating DNA fragments for plasmid
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of
enzyme in a larger volume. Appropriate buffers and substrate amounts for
particular enzymes are specified by the manufacturer. Incubation times of
about 1 hour at 37°C are ordinarily used, but may vary in accordance with the 
supplier's instructions. After digestion, gel electrophoresis may be performed 
to isolate the desired fragment. 

[083] As used herein and as known in the art, the term "identity" is the 
relationship between two or more polynucleotide sequences, as determined 
by comparing the sequences. Identity also means the degree of sequence 
relatedness between polynucleotide sequences, as determined by the match 
between strings of such sequences. Identity can be readily calculated (see, 
e.g., Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988), and Biocomputing: Informatics and Genome
Projects, Smith, D.W., ed., Academic Press, New York (1993), both of which
are incorporated by reference herein). While there exist a number of methods
to measure identity between two polynucleotide sequences, the term is well
known to skilled artisans (see, e.g., Sequence Analysis in Molecular Biology,
von Heijne, G., Academic Press (1987); and Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York (1991)).
Methods commonly employed to determine identity between sequences 
include, for example, those disclosed in Carillo, H., and Lipman, D., SIAM J. 
Applied Math. (1988) 48:1073. "Substantially identical," as used herein, 
means there is a very high degree of homology (preferably 100% sequence 
identity) between subject polynucleotide sequences. However, 
polynucleotides having greater than 90%, or 95% sequence identity may be 
used in the present invention, and thus sequence variations that might be 
expected due to genetic mutation, strain polymorphism, or evolutionary 
divergence can be tolerated. 
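
As a rough illustration of the identity thresholds discussed above (at least 75%, greater than 90% or 95%), the following sketch computes percent identity for two sequences that have already been aligned. It is not the method of any of the cited references: the function name, the gap convention (columns where both sequences have a gap are skipped, a gap against a residue counts as a mismatch) and the example sequences are all assumptions made for illustration.

```python
# Sketch: percent identity between two pre-aligned sequences.
# Gap handling and all names here are illustrative assumptions.

def percent_identity(seq_a, seq_b):
    """Percent of compared alignment columns holding the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" and y == "-":
            continue  # assumption: skip columns that are gaps in both
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    a = "ACGTACGT"  # hypothetical aligned sequences
    b = "ACGTACGA"
    identity = percent_identity(a, b)
    print(f"{identity:.1f}% identity; meets 75% threshold: {identity >= 75.0}")
```

Different alignment tools normalize identity differently (for example by the shorter sequence length rather than by aligned columns), so a threshold check of this kind should state which convention it uses.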

[084] The biosynthetic locus for the production of the Compound 2(a) spans 
approximately 176,000 base pairs of DNA and encodes 38 proteins. More 
than 10 kilobases of DNA sequence were analyzed on each side of the locus 
and these regions were found to contain primary metabolic genes. 
The order and relative position of the 38 open reading frames representing the 
proteins of the biosynthetic locus for Compound 2(a) are provided in Figure 1.
Referring to Figure 1, the genes involved in the biosynthesis of Compound
2(a) are contained within two contiguous nucleotide sequences (SEQ ID NOS:
1 and 18). The contiguous nucleotide sequences are arranged such that, as
found within the compound 2(a) biosynthetic locus, the 3' end of the 11,740
base pairs of DNA of contig 1 (SEQ ID NO: 1) is found adjacent to the 5' end
of the 164,051 base pairs of DNA of contig 2 (SEQ ID NO: 18).
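
The stated locus size can be cross-checked against the two contig lengths. The short sketch below is only an arithmetic check; the variable and function names are hypothetical, and the contig lengths are the figures quoted in the paragraph above.

```python
# Sketch: check that the two contig lengths add up to the stated locus size.
# Names are illustrative; the lengths are the values quoted in the text.

CONTIG_1_BP = 11_740    # SEQ ID NO: 1
CONTIG_2_BP = 164_051   # SEQ ID NO: 18


def locus_length(*contig_lengths):
    """Total length of adjacent, non-overlapping contigs in base pairs."""
    return sum(contig_lengths)


if __name__ == "__main__":
    total = locus_length(CONTIG_1_BP, CONTIG_2_BP)
    print(f"combined length: {total} bp")
    print(f"rounded to the nearest kilobase: {round(total, -3)} bp")
```

The sum, 175,791 base pairs, is consistent with the "approximately 176,000 base pairs" given for the locus.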

[085] The nucleotide sequence and polypeptide sequences relating to the 
locus of compound 2(a) are provided in the sequence listing filed together with 
and forming part of this application. SEQ ID NO: 1 is the 11,740 contiguous
base pairs of contig 1 comprising eight open reading frames, namely ORF 1 to 
ORF 8 listed in SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15 and 17 respectively. The 
gene product of ORF 1 (SEQ ID NO: 2) is the 719 amino acids deduced from 
the nucleic acid sequence of SEQ ID NO: 3 which is drawn from residues 418 
to 2577 (sense strand) of contig 1 (SEQ ID NO: 1). The gene product of ORF
2 (SEQ ID NO: 4) is the 253 amino acids deduced from the nucleic acid
sequence of SEQ ID NO: 5 which is drawn from residues 3006 to 3767 (sense 
strand) of contig 1 (SEQ ID NO: 1). The gene product of ORF 3 (SEQ ID NO: 
6) is the 956 amino acids deduced from the nucleic acid sequence of SEQ ID 
NO: 7 which is drawn from residues 4016 to 6886 (sense strand) of contig 1 
(SEQ ID NO: 1). The gene product of ORF 4 (SEQ ID NO: 8) is the 201 
amino acids deduced from the nucleic acid sequence of SEQ ID NO: 9 which 
is drawn from residues 7581 to 6976 (antisense strand) of contig 1 (SEQ ID 
NO: 1). The gene product of ORF 5 (SEQ ID NO: 10) is the 416 amino acids 
deduced from the nucleic acid sequence of SEQ ID NO: 11 which is drawn
from residues 8848 to 7598 (antisense strand) of contig 1 (SEQ ID NO: 1). 
The gene product of ORF 6 (SEQ ID NO: 12) is the 186 amino acids deduced 
from the nucleic acid sequence of SEQ ID NO: 13 which is drawn from 
residues 9053 to 9613 (sense strand) of contig 1 (SEQ ID NO: 1). The gene 
product of ORF 7 (SEQ ID NO: 14) is the 163 amino acids deduced from the 
nucleic acid sequence of SEQ ID NO: 15 which is drawn from residues 9682 
to 10173 (sense strand) of contig 1 (SEQ ID NO: 1). The gene product of 
ORF 8 (SEQ ID NO: 16) is the 514 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 17 which is drawn from residues 10170 to 11714
(sense strand) of contig 1 (SEQ ID NO: 1).

[086] SEQ ID NO: 18 is the 164,051 contiguous base pairs of contig 2
comprising 30 ORFs, namely ORF 9 to ORF 38 listed in SEQ ID NOS: 20, 22, 
24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 
64, 66, 68, 70, 72, 74, 76 and 78 respectively. The gene product of ORF 9 
(SEQ ID NO: 19) is the 367 amino acids deduced from the nucleic acid
sequence of SEQ ID NO: 20 which is drawn from residues 1109 to 6
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 10 
(SEQ ID NO: 21) is the 8147 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 22 which is drawn from residues 1375 to 25818 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 11
(SEQ ID NO: 23) is the 3428 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 24 which is drawn from residues 25902 to 36188 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 12 
(SEQ ID NO: 25) is the 6751 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 26 which is drawn from residues 36213 to 56468 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 13 
(SEQ ID NO: 27) is the 1657 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 28 which is drawn from residues 56600 to 61573 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 14 
(SEQ ID NO: 29) is the 5207 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 30 which is drawn from residues 61852 to 77475 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 15 
(SEQ ID NO: 31) is the 5432 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 32 which is drawn from residues 77606 to 93904 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 16 
(SEQ ID NO: 33) is the 3227 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 34 which is drawn from residues 94057 to 103740 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 17 
(SEQ ID NO: 35) is the 7510 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 36 which is drawn from residues 103789 to 126321 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 18 
(SEQ ID NO: 37) is the 3872 amino acids deduced from the nucleic acid
sequence of SEQ ID NO: 38 which is drawn from residues 126389 to 138007
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 19
(SEQ ID NO: 39) is the 338 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 40 which is drawn from residues 139079 to 138063 
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 20 
(SEQ ID NO: 41) is the 283 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 42 which is drawn from residues 140117 to 139266
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 21 
(SEQ ID NO: 43) is the 329 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 44 which is drawn from residues 141103 to 140114
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 22 
(SEQ ID NO: 45) is the 317 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 46 which is drawn from residues 141483 to 142436 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 23 
(SEQ ID NO: 47) is the 204 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 48 which is drawn from residues 142440 to 143054 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 24 
(SEQ ID NO: 49) is the 328 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 50 which is drawn from residues 143133 to 144119
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 25 
(SEQ ID NO: 51) is the 328 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 52 which is drawn from residues 144116 to 145102
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 26 
(SEQ ID NO: 53) is the 214 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 54 which is drawn from residues 145099 to 145743 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 27 
(SEQ ID NO: 55) is the 470 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 56 which is drawn from residues 145818 to 147230 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 28 
(SEQ ID NO: 57) is the 553 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 58 which is drawn from residues 148967 to 147306 
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 29 
(SEQ ID NO: 59) is the 231 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 60 which is drawn from residues 149871 to 149176
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 30
(SEQ ID NO: 61) is the 306 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 62 which is drawn from residues 150788 to 149868 
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 31 
(SEQ ID NO: 63) is the 998 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 64 which is drawn from residues 153765 to 150769 
(antisense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 32 
(SEQ ID NO: 65) is the 518 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 66 which is drawn from residues 154485 to 156041 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 33 
(SEQ ID NO: 67) is the 329 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 68 which is drawn from residues 156075 to 157064 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 34 
(SEQ ID NO: 69) is the 521 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 70 which is drawn from residues 157308 to 158873 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 35 
(SEQ ID NO: 71) is the 410 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 72 which is drawn from residues 158970 to 160202 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 36 
(SEQ ID NO: 73) is the 506 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 74 which is drawn from residues 160199 to 161719 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 37 
(SEQ ID NO: 75) is the 217 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 76 which is drawn from residues 161924 to 162577 
(sense strand) of contig 2 (SEQ ID NO: 18). The gene product of ORF 38 
(SEQ ID NO: 77) is the 442 amino acids deduced from the nucleic acid 
sequence of SEQ ID NO: 78 which is drawn from residues 162723 to 164051 
(sense strand) of contig 2 (SEQ ID NO: 18). 
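
The residue ranges and polypeptide lengths recited above can be cross-checked arithmetically. The following is a minimal sketch, assuming that each recited range includes a terminal stop codon, so that the span equals three bases per amino acid plus three. This assumption is inferred from the recited numbers and is not stated in the description. Only a sample of the ORFs is shown, and the helper names are hypothetical.

```python
# (ORF number, first residue, last residue, amino acids) as recited above.
# Antisense ORFs are recited with the larger coordinate first.
ORFS = [
    (1, 418, 2577, 719),
    (4, 7581, 6976, 201),
    (8, 10170, 11714, 514),
    (9, 1109, 6, 367),
    (10, 1375, 25818, 8147),
    (38, 162723, 164051, 442),
]


def span_length(start: int, end: int) -> int:
    """Number of bases covered, whichever strand the ORF lies on."""
    return abs(end - start) + 1


def is_consistent(start: int, end: int, amino_acids: int) -> bool:
    """True if the span equals three bases per residue plus one stop codon."""
    return span_length(start, end) == 3 * (amino_acids + 1)


for orf, start, end, amino_acids in ORFS:
    strand = "sense" if start < end else "antisense"
    print(orf, strand, span_length(start, end), is_consistent(start, end, amino_acids))
```

For example, ORF 1 spans 2577 - 418 + 1 = 2160 bases, which is 3 x (719 + 1).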

[087] Some open reading frames listed herein initiate with non-standard 
initiation codons (e.g. GTG - Valine or CTG - Leucine) rather than the
standard initiation codon ATG, namely ORFs 3, 5, 6, 9, 11, 13, 21, 22, 23, 24,
27, 34, 36 and 37 (SEQ ID NOS: 7, 11, 13, 20, 24, 28, 44, 46, 48, 50, 56, 70,
74 and 76). All ORFs are listed with the appropriate M, V or L amino acids at
the amino-terminal position to indicate the specificity of the first codon of the
ORF. It is expected, however, that in all cases the biosynthesized protein will 
contain a methionine residue, and more specifically a formylmethionine 
residue, at the amino terminal position, in keeping with the widely accepted 
principle that protein synthesis in bacteria initiates with methionine 
(formylmethionine) even when the encoding gene specifies a non-standard 
initiation codon (e.g. Stryer, Biochemistry 3rd edition, 1998, W.H. Freeman and
Co., New York, pp. 752-754). 
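
The distinction between the residue listed at the amino terminus and the residue expected in the biosynthesized protein can be sketched as follows. This is an illustration only: the function names and the toy sequence are hypothetical, and only the three initiation codons named above are handled.

```python
# Residue written at the amino-terminal position of a listed sequence,
# keyed by the first codon of the ORF (codons named in the description).
LISTED_FIRST_RESIDUE = {"ATG": "M", "GTG": "V", "CTG": "L"}


def listed_first_residue(orf_dna: str) -> str:
    """Residue shown at position 1 of the listed polypeptide sequence."""
    codon = orf_dna[:3].upper()
    if codon not in LISTED_FIRST_RESIDUE:
        raise ValueError(f"{codon!r} is not an initiation codon named in the description")
    return LISTED_FIRST_RESIDUE[codon]


def expected_first_residue(orf_dna: str) -> str:
    """Residue expected in the biosynthesized protein: (formyl)methionine."""
    listed_first_residue(orf_dna)  # validates the initiation codon
    return "M"


toy_orf = "GTGACCGAATGA"  # hypothetical sequence, not taken from the sequence listing
print(listed_first_residue(toy_orf), expected_first_residue(toy_orf))
```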

[088] Five E. coli DH10B deposits, each harbouring a cosmid clone of a
partial biosynthetic locus for compound 2(a) from Streptomyces aizunensis
(NRRL B-11277) and together spanning the full locus, were deposited with the
International Depositary Authority of Canada, Bureau of Microbiology, Health 
Canada, 1015 Arlington Street, Winnipeg, Manitoba, Canada R3E 3R2 on 
February 25, 2003 and were assigned deposit accession numbers IDAC 
250203-01, IDAC 250203-02, IDAC 250203-03, IDAC 250203-04 and IDAC 
250203-05 respectively. The sequence of the polynucleotides comprised in 
the deposited strains, as well as the amino acid sequence of any polypeptide 
encoded thereby are controlling in the event of any conflict with any 
description of sequences herein. 

[089] A natural mutant of Streptomyces aizunensis (NRRL B-11277), referred
to as strain [C03]023 producing Compound 2(a) and used to produce the 
compounds of Formula I and Formula II was deposited with the International 
Depositary Authority of Canada, Bureau of Microbiology, Health Canada, 
1015 Arlington Street, Winnipeg, Manitoba, Canada R3E 3R2 on August 7, 
2003 and was assigned deposit accession number IDAC 070803-1. 
[090] Another mutant of Streptomyces aizunensis (NRRL B-11277), referred
to as strain [C03U03]023 producing Compound 2(a) and used to produce the 
compounds of Formula I and Formula II was deposited with the International 
Depositary Authority of Canada, Bureau of Microbiology, Health Canada, 
1015 Arlington Street, Winnipeg, Manitoba, Canada R3E 3R2 on December 
23, 2003 and was assigned deposit accession number IDAC 231203-02. 
[091] The deposits of the cosmids and of strains [C03]023 and [C03U03]023 (the
deposited strains) have been made under the terms of the Budapest Treaty on
the International Recognition of the Deposit of Micro-organisms for Purposes
of Patent Procedure. The deposited strains will be irrevocably and without 
restriction or condition released to the public upon the issuance of a patent. 
The deposited strains are provided merely as a convenience to those skilled in
the art and are not an admission that a deposit is required for enablement. A
license may be required to make, use or sell the deposited strains, and
compounds derived therefrom, and no such license is hereby granted.
[092] The order and relative position of the 38 open reading frames 
representing the proteins of the biosynthetic locus for compound 2(a) 
(compound 2(a) ORFs) are illustrated schematically in Figure 1. The top line 
in Figure 1 provides a scale in base pairs. The gray bars depict the two DNA 
contigs that cover the compound 2(a) locus. The empty arrows represent the 
38 open reading frames of the compound 2(a) biosynthetic locus. The black 
arrows represent the five deposited cosmid clones covering the entire 
compound 2(a) locus. 

[093] One aspect of the present invention is an isolated, purified, or enriched 
nucleic acid comprising one of the sequences of SEQ ID NOS: 3, 5, 7, 9, 11,
13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 
54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, the sequences 
complementary thereto, or a fragment comprising at least 100, 200, 300, 400, 
500, 600, 700, 800 or more consecutive bases of one of the sequences of 
SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 
40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 or 
the sequences complementary thereto. The isolated, purified or enriched 
nucleic acids may comprise DNA, including cDNA, genomic DNA, and 
synthetic DNA. The DNA may be double stranded or single stranded, and if 
single stranded may be the coding (sense) or non-coding (anti-sense) strand. 
Alternatively, the isolated, purified or enriched nucleic acids may comprise 
RNA. 

[094] As discussed in more detail below, the isolated, purified or enriched 
nucleic acids of one of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 
68, 70, 72, 74, 76, 78 may be used to prepare one of the polypeptides of SEQ
ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39,
41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
respectively, or fragments comprising at least 50, 75, 100, 200, 300, 500 or
more consecutive amino acids of one of the polypeptides of SEQ ID NOS: 2, 4,
6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47,
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77.
[095] Accordingly, another aspect of the present invention is an isolated, 
purified or enriched nucleic acid which encodes one of the polypeptides of 
SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or 
fragments comprising at least 50, 75, 100, 150, 200, 300 or more consecutive
amino acids of one of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14,
16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55,
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77. The coding sequences of these
nucleic acids may be identical to one of the coding sequences of one of the
nucleic acids of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30,
32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 
72, 74, 76, 78 or a fragment thereof, or may be different coding sequences 
which encode one of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 
16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments comprising at least 50, 
75, 100, 150, 200, 300 consecutive amino acids of one of the polypeptides of 
SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 as 
a result of the redundancy or degeneracy of the genetic code. The genetic 
code is well known to those of skill in the art and can be obtained, for 
example, from Stryer, Biochemistry, 3rd edition, W. H. Freeman & Co., New
York. 

[096] The isolated, purified or enriched nucleic acid which encodes one of 
the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
69, 71, 73, 75, 77 may include, but is not limited to: (1) only the coding 
sequences of one of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26,
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 
68, 70, 72, 74, 76, 78; (2) the coding sequences of SEQ ID NOS: 3, 5, 7, 9, 
11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and additional coding 
sequences, such as leader sequences or proprotein; and (3) the coding 
sequences of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 
32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 
72, 74, 76, 78 and non-coding sequences, such as non-coding sequences 5' 
and/or 3' of the coding sequence. Thus, as used herein, the term 
"polynucleotide encoding a polypeptide" encompasses a polynucleotide that 
includes only coding sequence for the polypeptide as well as a polynucleotide 
that includes additional coding and/or non-coding sequence. 
[097] The invention relates to polynucleotides based on SEQ ID NOS: 3, 5, 
7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48,
50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 but having 
polynucleotide changes that are "silent", for example changes which do not 
alter the amino acid sequence encoded by the polynucleotides of SEQ ID 
NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78. The 
invention also relates to polynucleotides which have nucleotide changes 
which result in amino acid substitutions, additions, deletions, fusions and 
truncations of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 
61, 63, 65, 67, 69, 71, 73, 75, 77. Such nucleotide changes may be
introduced using techniques such as site directed mutagenesis, random 
chemical mutagenesis, exonuclease III deletion, and other recombinant DNA
techniques. 

[098] The isolated, purified or enriched nucleic acids of SEQ ID NOS: 3, 5, 7, 
9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, the sequences 
complementary thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 consecutive bases of one of 
the sequences of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30,
32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70,
72, 74, 76, 78, or the sequences complementary thereto may be used as 
probes to identify and isolate DNAs encoding the polypeptides of SEQ ID 
NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 
43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77
respectively. In such procedures, a genomic DNA library is constructed from 
a sample microorganism or a sample containing a microorganism capable of 
producing a polyketide. The genomic DNA library is then contacted with a 
probe comprising a coding sequence or a fragment of the coding sequence, 
encoding one of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57,
59, 61, 63, 65, 67, 69, 71, 73, 75, 77, or a fragment thereof under conditions
which permit the probe to specifically hybridize to sequences complementary 
thereto. In a preferred embodiment, the probe is an oligonucleotide of about 
10 to about 30 nucleotides in length designed based on a nucleic acid of SEQ 
ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76 or 78. 
Genomic DNA clones which hybridize to the probe are then detected and 
isolated. Procedures for preparing and identifying DNA clones of interest are 
disclosed in Ausubel et al., Current Protocols in Molecular Biology, John Wiley
& Sons, Inc. 1997; and Sambrook et al., Molecular Cloning: A Laboratory
Manual 2d Ed., Cold Spring Harbor Laboratory Press, 1989. In another 
embodiment, the probe is a restriction fragment or a PCR amplified nucleic 
acid derived from SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 
30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 
70, 72, 74, 76, 78.

[099] The isolated, purified or enriched nucleic acids of SEQ ID NOS: 3, 5, 7, 
9, 11, 13, 15, 17, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, the sequences 
complementary thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 
35, 40, 50, 75, 100, 150, 200, 300, 400 or 500 consecutive bases of one of 
the sequences of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26, 28,
30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68,
70, 72, 74, 76, 78, or the sequences complementary thereto, may be used as
probes to identify and isolate related nucleic acids. In some embodiments, 
the related nucleic acids may be genomic DNAs (or cDNAs) from potential 
polyketide producers. In such procedures, a nucleic acid sample containing 
nucleic acids from a potential polyketide producer is contacted with the probe 
under conditions that permit the probe to specifically hybridize to related 
sequences. The nucleic acid sample may be a genomic DNA (or cDNA) 
library from the potential polyketide-producer. Hybridization of the probe to 
nucleic acids is then detected using any of the methods described above. 
[0100] Hybridization may be carried out under conditions of low stringency, 
moderate stringency or high stringency. As an example of nucleic acid 
hybridization, a polymer membrane containing immobilized denatured nucleic 
acids is first prehybridized for 30 minutes at 45 °C in a solution consisting of 
0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X
Denhardt's, and 0.5 mg/ml polyriboadenylic acid. Approximately 2 x 10^7 cpm
(specific activity 4-9 x 10^8 cpm/ug) of 32P end-labeled oligonucleotide probe
are then added to the solution. After 12-16 hours of incubation, the
membrane is washed for 30 minutes at room temperature in 1X SET (150 mM
NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the 
oligonucleotide probe where Tm is the melting temperature. The membrane 
is then exposed to autoradiographic film for detection of hybridization signals. 
[0101] By varying the stringency of the hybridization conditions used to identify 
nucleic acids, such as genomic DNAs or cDNAs, which hybridize to the 
detectable probe, nucleic acids having different levels of homology to the 
probe can be identified and isolated. Stringency may be varied by conducting 
the hybridization at varying temperatures below the melting temperatures of 
the probes. The melting temperature of the probe may be calculated using 
the following formulas: 

[0102] For oligonucleotide probes between 14 and 70 nucleotides in length the
melting temperature (Tm) in degrees Celsius may be calculated using the
formula: Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (600/N) where N is
the length of the oligonucleotide.

[0103] If the hybridization is carried out in a solution containing formamide, the
melting temperature may be calculated using the equation Tm = 81.5 + 16.6(log
[Na+]) + 0.41(fraction G+C) - (0.63% formamide) - (600/N) where N is the
length of the probe.
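
The two melting temperature formulas above can be sketched in code as follows. This is a hedged illustration and not a definitive implementation. It rests on three assumptions that the text does not state: that "log" denotes the base 10 logarithm, that the G+C term is entered as a percentage (as in the commonly cited form of this formula) although the text writes "fraction G+C", and that the formamide term is 0.63 multiplied by the percentage of formamide. The function names and the example probe are hypothetical.

```python
import math


def gc_percent(probe: str) -> float:
    """G+C content of the probe as a percentage (assumed unit, see lead-in)."""
    probe = probe.upper()
    return 100.0 * (probe.count("G") + probe.count("C")) / len(probe)


def melting_temperature(probe: str, na_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Estimate Tm in degrees Celsius.

    With formamide_percent = 0 this is the first formula; otherwise the
    formamide correction of the second formula is applied.
    """
    n = len(probe)
    return (81.5
            + 16.6 * math.log10(na_molar)    # assumed base 10 logarithm
            + 0.41 * gc_percent(probe)       # assumed percentage, not fraction
            - 0.63 * formamide_percent       # assumed 0.63 x (% formamide)
            - 600.0 / n)


probe = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer with 50% G+C
print(round(melting_temperature(probe, na_molar=1.0), 1))
print(round(melting_temperature(probe, na_molar=1.0, formamide_percent=50.0), 1))
```

Under these assumptions, the hypothetical 20-mer at 1 M Na+ gives a Tm of about 72 °C without formamide and about 40.5 °C in 50% formamide.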

[0104] Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent,
0.5% SDS, 0.1 mg/ml denatured fragmented salmon sperm DNA or 6X SSC, 
5X Denhardt's reagent, 0.5% SDS, 0.1 mg/ml denatured fragmented salmon 
sperm DNA, 50% formamide. The compositions of the SSC and Denhardt's
solutions are listed in Sambrook et al., supra.
[0105] Hybridization is conducted by adding the detectable probe to the
hybridization solutions listed above. Where the probe comprises double 
stranded DNA, it is denatured by incubating at elevated temperatures and 
quickly cooling before addition to the hybridization solution. It may also be 
desirable to similarly denature single stranded probes to eliminate or diminish 
formation of secondary structures or oligomerization. The filter is contacted 
with the hybridization solution for a sufficient period of time to allow the probe 
to hybridize to cDNAs or genomic DNAs containing sequences 
complementary thereto or homologous thereto. For probes over 200 
nucleotides in length, the hybridization may be carried out at 15-25 °C below 
the Tm. For shorter probes, such as oligonucleotide probes, the hybridization 
may be conducted at 5-10 °C below the Tm. For shorter probes, the hybridization
is preferably conducted in 6X SSC. For longer probes, the hybridization is
preferably conducted in solutions containing 50% formamide. All the
foregoing hybridizations would be considered to be examples of hybridization 
performed under conditions of high stringency. 

[0106] Following hybridization, the filter is washed for at least 15 minutes in 2X
SSC, 0.1% SDS at room temperature or higher, depending on the desired 
stringency. The filter is then washed with 0.1X SSC, 0.5% SDS at room 
temperature (again) for 30 minutes to 1 hour. Nucleic acids which have 
hybridized to the probe are identified by conventional autoradiography and 
non-radioactive detection methods. 

[0107] The above procedure may be modified to identify nucleic acids having
decreasing levels of homology to the probe sequence. For example, to obtain
nucleic acids of decreasing homology to the detectable probe, less stringent
conditions may be used. For example, the hybridization temperature may be
decreased in increments of 5 °C from 68 °C to 42 °C in a hybridization buffer
having a Na+ concentration of approximately 1M. Following hybridization, the 
filter may be washed with 2X SSC, 0.5% SDS at the temperature of 
hybridization. These conditions are considered to be "moderate stringency" 
conditions above 50°C and "low stringency" conditions below 50°C. A specific 
example of "moderate stringency" hybridization conditions is when the above 
hybridization is conducted at 55°C. A specific example of "low stringency" 
hybridization conditions is when the above hybridization is conducted at 45°C. 
[0108] Alternatively, the hybridization may be carried out in buffers, such as 
6X SSC, containing formamide at a temperature of 42 °C. In this case, the 
concentration of formamide in the hybridization buffer may be reduced in 5% 
increments from 50% to 0% to identify clones having decreasing levels of 
homology to the probe. Following hybridization, the filter may be washed with 
6X SSC, 0.5% SDS at 50 °C. These conditions are considered to be 
"moderate stringency" conditions above 25% formamide and "low stringency" 
conditions below 25% formamide. A specific example of "moderate
stringency" hybridization conditions is when the above hybridization is
conducted at 30% formamide. A specific example of "low stringency" 
hybridization conditions is when the above hybridization is conducted at 10% 
formamide. Nucleic acids which have hybridized to the probe are identified by 
conventional autoradiography and non-radioactive detection methods. 
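
The reduced-stringency bands described in the two preceding paragraphs can be summarised in a short sketch. This is an illustration only: the function names are hypothetical, and the boundary values of exactly 50 °C and exactly 25% formamide are returned as "unspecified" because the text classifies only conditions above or below those values.

```python
def stringency_by_temperature(temp_c: float) -> str:
    """Classify a hybridization temperature used with roughly 1 M Na+ buffer."""
    if temp_c > 50:
        return "moderate"
    if temp_c < 50:
        return "low"
    return "unspecified"  # the text does not classify exactly 50 degrees C


def stringency_by_formamide(formamide_percent: float) -> str:
    """Classify a formamide concentration for hybridization at 42 degrees C."""
    if formamide_percent > 25:
        return "moderate"
    if formamide_percent < 25:
        return "low"
    return "unspecified"  # the text does not classify exactly 25% formamide


# Stepping down in increments of 5 from the stated starting points. Note that
# 5 degree steps from 68 reach 43 rather than exactly 42.
temperature_series = list(range(68, 41, -5))
formamide_series = list(range(50, -1, -5))

for temp in temperature_series:
    print(temp, stringency_by_temperature(temp))
for pct in formamide_series:
    print(pct, stringency_by_formamide(pct))
```

The specific examples given in the text are reproduced by this sketch: 55 °C and 30% formamide fall in the moderate band, while 45 °C and 10% formamide fall in the low band.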
[0109] The preceding methods may be used to isolate nucleic acids having at
least 97%, at least 95%, at least 90%, at least 85%, at least 80%, or at least 
70% sequence identity to a nucleic acid sequence selected from the group 
consisting of the sequences of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22,
24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 
64, 66, 68, 70, 72, 74, 76, 78, fragments comprising at least 10, 15, 20, 25, 
30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive bases thereof, 
and the sequences complementary thereto. The isolated nucleic acid may 
have a coding sequence that is a naturally occurring allelic variant of one of 
the coding sequences described herein. Such allelic variant may have a
substitution, deletion or addition of one or more nucleotides when compared
to the nucleic acids of SEQ ID NOS: 3, 5, 7, 9, 11, 13, 15, 17, 20, 22, 24, 26,
28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66,
68, 70, 72, 74, 76, 78, or the sequences complementary thereto.

[0110] Additionally, the above procedures may be used to isolate nucleic acids
which encode polypeptides having at least 99%, at least 95%, at least 90%, at
least 85%, at least 80%, or at least 70% identity to a polypeptide having the
sequence of one of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27,
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
69, 71, 73, 75, 77 or fragments comprising at least 50, 75, 100, 150, 200, 300
consecutive amino acids thereof as determined using the BLASTP version 
2.2.2 algorithm with default parameters. 

[0111] Another aspect of the present invention is an isolated or purified
polypeptide comprising the sequence of one of SEQ ID NOS: 2, 4, 6, 8, 10, 
12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 
53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments comprising at
least 50, 75, 100, 150, 200 or 300 consecutive amino acids thereof. As 
discussed herein, such polypeptides may be obtained by inserting a nucleic 
acid encoding the polypeptide into a vector such that the coding sequence is 
operably linked to a sequence capable of driving the expression of the 
encoded polypeptide in a suitable host cell. For example, the expression 
vector may comprise a promoter, a ribosome binding site for translation 
initiation and a transcription terminator. The vector may also include 
appropriate sequences for modulating expression levels, an origin of 
replication and a selectable marker. 

[0112] Promoters suitable for expressing the polypeptide or fragment thereof
in bacteria include the E. coli lac or trp promoters, the lacI promoter, the lacZ
promoter, the T3 promoter, the T7 promoter, the gpt promoter, the lambda PR
promoter, the lambda PL promoter, promoters from operons encoding
glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), and the acid
phosphatase promoter. Fungal promoters include the α factor promoter.
Eukaryotic promoters include the CMV immediate early promoter, the HSV
thymidine kinase promoter, heat shock promoters, the early and late SV40
promoter, LTRs from retroviruses, and the mouse metallothionein-I promoter.
Other promoters known to control expression of genes in prokaryotic or 
eukaryotic cells or their viruses may also be used. 
[0113] Mammalian expression vectors may also comprise an origin of
replication, any necessary ribosome binding sites, a polyadenylation site, 
splice donors and acceptor sites, transcriptional termination sequences, and 
5' flanking nontranscribed sequences. In some embodiments, DNA 
sequences derived from the SV40 splice and polyadenylation sites may be 
used to provide the required nontranscribed genetic elements. 
[0114] Vectors for expressing the polypeptide or fragment thereof in 
eukaryotic cells may also contain enhancers to increase expression levels. 
Enhancers are cis-acting elements of DNA, usually from about 10 to about 
300 bp in length that act on a promoter to increase its transcription. Examples 
include the SV40 enhancer on the late side of the replication origin bp 100 to 
270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on 
the late side of the replication origin, and the adenovirus enhancers. 
[0115] In addition, the expression vectors preferably contain one or more 
selectable marker genes to permit selection of host cells containing the vector. 
Examples of selectable markers that may be used include genes encoding 
dihydrofolate reductase or genes conferring neomycin resistance for 
eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in 
E. coli, and the S. cerevisiae TRP1 gene. 
[0116] In some embodiments, the nucleic acid encoding one of the 
polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 
31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 
71, 73, 75, 77 or fragments comprising at least 50, 75, 100, 150, 200 or 300 
consecutive amino acids thereof is assembled in appropriate phase with a 
leader sequence capable of directing secretion of the translated polypeptides 
or fragments thereof. Optionally, the nucleic acid can encode a fusion 
polypeptide in which one of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 
14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments comprising at least 
5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids 
thereof is fused to heterologous peptides or polypeptides, such as N-terminal 
identification peptides which impart desired characteristics such as increased 
stability or simplified purification or detection. 

[0117] The appropriate DNA sequence may be inserted into the vector by a 
variety of procedures. In general, the DNA sequence is ligated to the desired 
position in the vector following digestion of the insert and the vector with 
appropriate restriction endonucleases. Alternatively, appropriate restriction 
enzyme sites can be engineered into a DNA sequence by PCR. A variety of 
cloning techniques are disclosed in Ausubel et al., Current Protocols in 
Molecular Biology, John Wiley & Sons, Inc. 1997 and Sambrook et al., 
Molecular Cloning: A Laboratory Manual 2d Ed., Cold Spring Harbor 
Laboratory Press, 1989. Such procedures and others are deemed to be 
within the scope of those skilled in the art. 
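
By way of illustration only, the choice of restriction endonucleases for the digestion step can be sketched in Python as a search of the insert for recognition sites. The enzymes shown (EcoRI, BamHI), the function name find_sites and the example insert sequence are assumptions made for this sketch and are not taken from the specification.

```python
# Hypothetical illustration: locate recognition sites of two common
# restriction endonucleases in an insert before planning a digestion.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",  # standard EcoRI recognition sequence
    "BamHI": "GGATCC",  # standard BamHI recognition sequence
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return the 0-based start positions of each recognition site."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping sites are also found.
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Made-up insert: an EcoRI site, a short coding stretch, then a BamHI site.
insert = "GAATTCATGGCTAGCAAAGGATCC"
sites_found = find_sites(insert)
print(sites_found)  # {'EcoRI': [0], 'BamHI': [18]}
```

An enzyme whose site occurs only at the ends of the insert, and not within the coding stretch, would be a candidate for directional cloning under these assumptions.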

[0118] The vector may be, for example, in the form of a plasmid, a viral 
particle, or a phage. Other vectors include derivatives of chromosomal, 
nonchromosomal and synthetic DNA sequences, viruses, bacterial plasmids, 
phage DNA, baculovirus, yeast plasmids, vectors derived from combinations 
of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox 
virus, and pseudorabies. A variety of cloning and expression vectors for use 
with prokaryotic and eukaryotic hosts are described by Sambrook et al., 
Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, 
N.Y., (1989). 

[0119] Particular bacterial vectors which may be used include the 
commercially available plasmids comprising genetic elements of the well 
known cloning vector pBR322 (ATCC 37017), pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden), pGEM1 (Promega Biotec, Madison, WI, USA), 
pQE70, pQE60, pQE-9 (Qiagen), pD10, phiX174, pBluescript™ II KS, pNH8A, 
pNH16a, pNH18A, pNH46A (Stratagene), ptrc99a, pKK223-3, pKK233-3, 
pDR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. Particular eukaryotic 
vectors include pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, 
pMSG, and pSVL (Pharmacia). However, any other vector may be used as 
long as it is replicable and stable in the host cell. 
[0120] The host cell may be any of the host cells familiar to those skilled in the 
art, including prokaryotic cells or eukaryotic cells. As representative examples 
of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli, 
Streptomyces lividans, Streptomyces griseofuscus, Streptomyces 
ambofaciens, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, Bacillus, and 
Staphylococcus, fungal cells, such as yeast, insect cells such as Drosophila 
S2 and Spodoptera Sf9, animal cells such as CHO, COS or Bowes 
melanoma, and adenoviruses. The selection of an appropriate host is within 
the abilities of those skilled in the art. 

[0121] The vector may be introduced into the host cells using any of a variety 
of techniques, including electroporation, transformation, transfection, 
transduction, viral infection, gene guns, or Ti-mediated gene transfer. Where 
appropriate, the engineered host cells can be cultured in conventional nutrient 
media modified as appropriate for activating promoters, selecting 
transformants or amplifying the genes of the present invention. Following 
transformation of a suitable host strain and growth of the host strain to an 
appropriate cell density, the selected promoter may be induced by appropriate 
means (e.g., temperature shift or chemical induction) and the cells may be 
cultured for an additional period to allow them to produce the desired 
polypeptide or fragment thereof. 

[0122] Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract is retained for further 
purification. Microbial cells employed for expression of proteins can be 
disrupted by any convenient method, including freeze-thaw cycling, 
sonication, mechanical disruption, or use of cell lysing agents. Such methods 
are well known to those skilled in the art. The expressed polypeptide or 
fragment thereof can be recovered and purified from recombinant cell cultures 
by methods including ammonium sulfate or ethanol precipitation, acid 
extraction, anion or cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, affinity 
chromatography, hydroxylapatite chromatography and lectin chromatography. 
Protein refolding steps can be used, as necessary, in completing configuration 
of the polypeptide. If desired, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 
[0123] Various mammalian cell culture systems can also be employed to 
express recombinant protein. Examples of mammalian expression systems 
include the COS-7 lines of monkey kidney fibroblasts (described by Gluzman, 
Cell, 23:175(1981)), and other cell lines capable of expressing proteins from a 
compatible vector, such as the C127, 3T3, CHO, HeLa and BHK cell lines. 
The constructs in host cells can be used in a conventional manner to produce 
the gene product encoded by the recombinant sequence. Polypeptides of the 
invention may or may not also include an initial methionine amino acid 
residue. 

[0124] Alternatively, the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 
16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments comprising at least 50, 
75, 100, 150, 200 or 300 consecutive amino acids thereof can be synthetically 
produced by conventional peptide synthesizers. In other embodiments, 
fragments or portions of the polynucleotides may be employed for producing 
the corresponding full-length polypeptide by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length 
polypeptides. 

[0125] Cell-free translation systems can also be employed to produce one of 
the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 
69, 71, 73, 75, 77 or fragments comprising at least 50, 75, 100, 150, 200 or 
300 consecutive amino acids thereof using mRNAs transcribed from a DNA 
construct comprising a promoter operably linked to a nucleic acid encoding 
the polypeptide or fragment thereof. In some embodiments, the DNA 
construct may be linearized prior to conducting an in vitro transcription 
reaction. The transcribed mRNA is then incubated with an appropriate cell- 
free translation extract, such as a rabbit reticulocyte extract, to produce the 
desired polypeptide or fragment thereof. 

[0126] The present invention also relates to variants of the polypeptides of 
SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 
39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or 
fragments comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino 
acids thereof. The term "variant" includes derivatives or analogs of these 
polypeptides. In particular, the variants may differ in amino acid sequence 
from the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 
65, 67, 69, 71, 73, 75, 77 by one or more substitutions, additions, deletions, 
fusions and truncations, which may be present in any combination. 
[0127] The variants may be naturally occurring or created in vitro. In 
particular, such variants may be created using genetic engineering techniques 
such as site directed mutagenesis, random chemical mutagenesis, 
exonuclease III deletion procedures, and standard cloning techniques. 
Alternatively, such variants, fragments, analogs, or derivatives may be created 
using chemical synthesis or modification procedures. 
[0128] Other methods of making variants are also familiar to those skilled in 
the art. These include procedures in which nucleic acid sequences obtained 
from natural isolates are modified to generate nucleic acids that encode 
polypeptides having characteristics which enhance their value in industrial or 
laboratory applications. In such procedures, a large number of variant 
sequences having one or more nucleotide differences with respect to the 
sequence obtained from the natural isolate are generated and characterized. 
Preferably, these nucleotide differences result in amino acid changes with 
respect to the polypeptides encoded by the nucleic acids from the natural 
isolates. 

[0129] For example, variants may be created using error prone PCR. In error 
prone PCR, DNA amplification is performed under conditions where the 
fidelity of the DNA polymerase is low, such that a high rate of point mutation is 
obtained along the entire length of the PCR product. Error prone PCR is 
described in Leung, D.W., et al., Technique, 1:11-15 (1989) and Caldwell, R. 
C. & Joyce G.F., PCR Methods Applic., 2:28-33 (1992). Variants may also be 
created using site directed mutagenesis to generate site-specific mutations in 
any cloned DNA segment of interest. Oligonucleotide mutagenesis is 
described in Reidhaar-Olson, J.F. & Sauer, R.T., et al., Science, 241:53-57 
(1988). Variants may also be created using directed evolution strategies such 
as those described in US patent nos. 6,361,974 and 6,372,497. The variants 
of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 
27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 
67, 69, 71, 73, 75 and 77 may be variants in which one or more of the amino 
acid residues of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 
59, 61, 63, 65, 67, 69, 71, 73, 75 or 77 are substituted with a conserved or 
non-conserved amino acid residue (preferably a conserved amino acid 
residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code. 
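
A minimal Python simulation, assuming a uniform per-base error rate, may help to show how low polymerase fidelity spreads point mutations along the whole length of an amplified product. The function names, the error rate and the template sequence below are hypothetical; the mutation spectrum of an actual error prone PCR is biased in ways this sketch does not model.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy `template`, replacing each base with a different base
    with probability `error_rate` (uniform-error assumption)."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute one of the three other bases at random.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def count_point_mutations(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)      # fixed seed so the run is repeatable
template = "ATGGCTAGCAAAGGAGAAGAACTTTTC" * 4   # made-up template
variant = error_prone_copy(template, error_rate=0.02, rng=rng)
print(count_point_mutations(template, variant), "point mutations")
```

Repeating the copy step over several rounds, each time starting from the previous variant, would accumulate mutations in the manner of successive amplification cycles.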

[0130] Conservative substitutions are those that substitute a given amino acid 
in a polypeptide by another amino acid of like characteristics. Typically seen 
as conservative substitutions are the following replacements: replacements of 
an aliphatic amino acid such as Ala, Val, Leu and Ile with another aliphatic 
amino acid; replacement of a Ser with a Thr or vice versa; replacement of an 
acidic residue such as Asp or Glu with another acidic residue; replacement of 
a residue bearing an amide group, such as Asn or Gln, with another residue 
bearing an amide group; exchange of a basic residue such as Lys or Arg with 
another basic residue; and replacement of an aromatic residue such as Phe 
or Tyr with another aromatic residue. 
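
The replacement classes listed above can be expressed as a small lookup, sketched here in Python. The grouping follows the classes named in this paragraph; the one-letter codes, the names CONSERVATIVE_GROUPS and is_conservative, and the treatment of residues outside these classes as non-conservative are assumptions of the sketch.

```python
# Residue classes taken from the paragraph above, in one-letter code.
CONSERVATIVE_GROUPS = [
    frozenset("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    frozenset("ST"),    # hydroxyl: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("NQ"),    # amide: Asn, Gln
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(original, replacement):
    """True if replacing `original` by `replacement` stays within one class."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("L", "I"))  # True: both aliphatic
print(is_conservative("D", "K"))  # False: acidic replaced by basic
```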

[0131] Other variants are those in which one or more of the amino acid 
residues of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 
23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 
63, 65, 67, 69, 71, 73, 75, 77 include a substituent group. Still other variants 
are those in which the polypeptide is associated with another compound, such 
as a compound to increase the half-life of the polypeptide (for example, 
polyethylene glycol). Additional variants are those in which additional amino 
acids are fused to the polypeptide, such as a leader sequence, a secretory 
sequence, a proprotein sequence or a sequence that facilitates purification, 
enrichment, or stabilization of the polypeptide. 

[0132] In some embodiments, the fragments, derivatives and analogs retain 
the same biological function or activity as the polypeptides of SEQ ID NOS: 2, 
4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77. In other 
embodiments, the fragment, derivative or analogue includes a fused 
heterologous sequence that facilitates purification, enrichment, detection, 
stabilization or secretion of the polypeptide that can be enzymatically cleaved, 
in whole or in part, away from the fragment, derivative or analogue. 
[0133] Another aspect of the present invention relates to polypeptides or fragments 
thereof which have at least 70%, at least 80%, at least 85%, at least 90%, or 
more than 95% identity to one of the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 
10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 
51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75 and 77 or a fragment 
comprising at least 50, 75, 100, 150, 200 or 300 consecutive amino acids 
thereof. It will be appreciated that amino acid "identity" includes conservative 
substitutions such as those described above. 
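
As a sketch only, the following Python function counts exact matches and conservative matches over a pairwise alignment and reports them in the "matches/alignment length (percent)" form. The residue classes are those listed among the conservative substitutions above; the function name, the use of "-" for gaps, the choice of alignment length as denominator and the two short sequences are assumptions of this sketch. Alignment programs such as BLASTP score similarity with a substitution matrix rather than with fixed classes, so their figures can differ.

```python
# Residue classes from the conservative-substitution paragraph (one-letter code).
GROUPS = [frozenset(g) for g in ("AVLI", "ST", "DE", "NQ", "KR", "FY")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (identical, similar, alignment_length) for two aligned strings.

    `similar` counts identical positions plus conservative replacements.
    Positions where either sequence has a gap ('-') count toward the
    alignment length but never toward identity or similarity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
            similar += 1
        elif any(a in g and b in g for g in GROUPS):
            similar += 1
    return identical, similar, len(aligned_a)


# Made-up ten-column alignment with one gap.
ident, simil, length = identity_and_similarity("MKTLLVA-DE", "MKSLIVAGDQ")
print(f"{ident}/{length} ({100 * ident / length:.2f}%) identity")
print(f"{simil}/{length} ({100 * simil / length:.2f}%) similarity")
```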
[0134] The polypeptides or fragments having homology to one of the 
polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 
31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 
71, 73, 75, 77 or a fragment comprising at least 50, 75, 100, 150, 200 or 300 
consecutive amino acids thereof may be obtained by isolating the nucleic 
acids encoding them using the techniques described above. 
[0135] Alternatively, the homologous polypeptides or fragments may be 
obtained through biochemical enrichment or purification procedures. The 
sequence of potentially homologous polypeptides or fragments may be 
determined by proteolytic digestion, gel electrophoresis and/or 
microsequencing. The sequence of the prospective homologous polypeptide 
or fragment can be compared to one of the polypeptides of SEQ ID NOS: 2, 4, 
6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 
49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or a fragment 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 
consecutive amino acids thereof. 

[0136] The polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 
65, 67, 69, 71, 73, 75, 77 or fragments, derivatives or analogs thereof 
comprising at least 40, 50, 75, 100, 150, 200 or 300 consecutive amino acids 
thereof may be used in a variety of applications. For example, the 
polypeptides or fragments, derivatives or analogs thereof may be used to 
catalyze biochemical reactions as described elsewhere in the specification. 
[0137] The polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 
65, 67, 69, 71, 73, 75, 77 or fragments, derivatives or analogues thereof 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 
consecutive amino acids thereof, may also be used to generate antibodies 
which bind specifically to the polypeptides or fragments, derivatives or 
analogues. The antibodies generated from SEQ ID NOS: 2, 4, 6, 8, 10, 12, 
14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 
55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 may be used to determine 
whether a biological sample contains Streptomyces aizunensis or a related 
microorganism. 

[0138] In such procedures, a biological sample is contacted with an antibody 
capable of specifically binding to one of the polypeptides of SEQ ID NOS: 2, 
4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 
47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 
consecutive amino acids thereof. The ability of the biological sample to bind 
to the antibody is then determined. For example, binding may be determined 
by labeling the antibody with a detectable label such as a fluorescent agent, 
an enzymatic label, or a radioisotope. Alternatively, binding of the antibody to 
the sample may be detected using a secondary antibody having such a 
detectable label thereon. A variety of assay protocols which may be used to 
detect the presence of a polyketide-producer or of Streptomyces aizunensis or 
of polypeptides related to SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 
65, 67, 69, 71, 73, 75, 77 in a sample are familiar to those skilled in the art. 
Particular assays include ELISA assays, sandwich assays, 
radioimmunoassays, and Western blots. Alternatively, antibodies generated 
from SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 
37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 
may be used to determine whether a biological sample contains related 
polypeptides that may be involved in the biosynthesis of polyketides. 
[0139] Polyclonal antibodies generated against the polypeptides of SEQ ID 
NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 
43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or 
fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 
consecutive amino acids thereof can be obtained by direct injection of the 
polypeptides into an animal or by administering the polypeptides to an animal, 
preferably a nonhuman. The antibody so obtained will then bind the 
polypeptide itself. In this manner, even a sequence encoding only a fragment 
of the polypeptide can be used to generate antibodies that may bind to the 
whole native polypeptide. Such antibodies can then be used to isolate the 
polypeptide from cells expressing that polypeptide. 

[0140] For preparation of monoclonal antibodies, any technique that provides 
antibodies produced by continuous cell line cultures can be used. Examples 
include the hybridoma technique (Kohler and Milstein, 1975, Nature, 256:495- 
497), the trioma technique, the human B-cell hybridoma technique (Kozbor et 
al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique (Cole, 
et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., 
pp. 77-96). 

[0141] Techniques described for the production of single chain antibodies 
(U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to 
the polypeptides of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 
29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 
69, 71, 73, 75, 77 or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 
40, 50, 75, 100, or 150 consecutive amino acids thereof. Alternatively, 
transgenic mice may be used to express humanized antibodies to these 
polypeptides or fragments thereof. 

[0142] Antibodies generated against the polypeptides of SEQ ID NOS: 2, 4, 6, 
8, 10, 12, 14, 16, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 
51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 or fragments comprising 
at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino 
acids thereof may be used in screening for similar polypeptides from a sample 
containing organisms or cell-free extracts thereof. In such techniques, 
polypeptides from the sample are contacted with the antibodies and those 
polypeptides which specifically bind the antibody are detected. Any of the 
procedures described above may be used to detect antibody binding. One 
such screening assay is described in "Methods for measuring Cellulase 
Activities", Methods in Enzymology, Vol 160, pp. 87-116. 
[0143] In order to identify the function of the genes in the compound 2(a) 
locus, ORFs 1 to 38 were compared, using the BLASTP version 2.2.1 
algorithm with the default parameters, to sequences in the National Center for 
Biotechnology Information (NCBI) nonredundant protein database and the 
DECIPHER® database of microbial genes, pathways and natural products 
(Ecopia Biosciences Inc. St.-Laurent, QC, Canada). 
[0144] The accession numbers of the top GenBank hits of this BLAST analysis 
are presented in Table 1 along with the corresponding E values. The E value 
relates the expected number of chance alignments with an alignment score at 
least equal to the observed alignment score. An E value of 0.00 indicates a 
perfect homolog. The E values are calculated as described in Altschul et al. J. 
Mol. Biol., 215, 403-410 (1990). The E value assists in the determination of 
whether two sequences display sufficient similarity to justify an inference of 
homology. 
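
The relationship between alignment score and E value can be sketched with the Karlin-Altschul expression E = K·m·n·exp(−λS), on which BLAST statistics are based, where S is the raw alignment score, m the query length and n the database length. The Python below is a minimal illustration; the function names and the values chosen for λ, K, m and n are assumptions for the example and are not parameters reported in this specification.

```python
import math


def expected_chance_alignments(raw_score, query_len, db_len, lam, k):
    """E value: expected number of chance alignments scoring at least
    `raw_score` when a query of length m is searched against n residues."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalized (bit) score, so that E = m * n * 2 ** (-bit_score)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Illustrative statistical parameters and search-space sizes (assumed).
LAMBDA, K = 0.267, 0.041
m, n = 700, 500_000_000

for score in (50, 100, 400):
    e_value = expected_chance_alignments(score, m, n, LAMBDA, K)
    print(score, f"{bit_score(score, LAMBDA, K):.1f} bits", f"E = {e_value:.3g}")
```

Because E falls exponentially with the score, long and well-conserved alignments give E values so small that they are reported as zero, while a short, weak match may yield an E value near or above 1.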



Table 1

[The ORF and #aa columns, and the entries marked "[illegible]", could not be read in the source.]

Family       GenBank homology      probability  % identity        % similarity      proposed function of GenBank match
[illegible]  T35189, 719aa         1E-200       556/705 (78.7%)   582/705 (82.55%)  helicase, Streptomyces coelicolor
             BAC17778.1, 686aa     1E-165       340/700 (48.57%)  407/700 (58.14%)  helicase, Corynebacterium efficiens
             [illegible]           1E-161       334/701 (47.65%)  412/701 (58.77%)  helicase, Corynebacterium glutamicum
TESA         BAB69315.1, 255aa     2E-82        142/243 (58.44%)  185/243 (76.13%)  thioesterase, Streptomyces avermitilis
             CAC20922.1, 255aa     3E-78        145/247 (58.7%)   180/247 (72.87%)  PimI thioesterase, Streptomyces natalensis
             AAF71777.1, 251aa     2E-73        135/244 (55.33%)  173/244 (70.9%)   NysE thioesterase, Streptomyces noursei
REGD         [illegible]           1E-132       336/959 (35.04%)  472/959 (49.22%)  transcriptional activator, Streptomyces venezuelae
             AAM88362.1, 945aa     1E-131       331/957 (34.59%)  468/957 (48.9%)   NbmM regulator, Streptomyces narbonensis
             BAA84600.1, 949aa     1E-127       339/965 (35.13%)  451/965 (46.74%)  response regulator, Streptomyces avermitilis
RREB         NP_629592.1, 224aa    8E-49        106/204 (51.96%)  140/204 (68.63%)  response regulator, Streptomyces coelicolor
             CAA74720.1, 217aa     8E-47        100/202 (49.5%)   138/202 (68.32%)  response regulator, Streptomyces reticuli
             NP_642485.1, 213aa    1E-43        96/201 (47.76%)   132/201 (65.67%)  regulator, Xanthomonas axonopodis
SPKK         NP_628447.1, 428aa    7E-39        116/312 (37.18%)  163/312 (52.24%)  kinase, Streptomyces coelicolor
             CAA74719.1, 398aa     2E-37        113/304 (37.17%)  157/304 (51.64%)  kinase, Streptomyces reticuli
             CAC32293.1, 404aa     6E-37        109/267 (40.82%)  139/267 (52.06%)  kinase, Streptomyces coelicolor
UNEW         [illegible]           0.002        30/102 (29.41%)   50/102 (49.02%)   membrane protein, Streptomyces coelicolor
UNFI         CAB10923.1, 177aa     2E-27        67/162 (41.36%)   97/162 (59.88%)   hypothetical protein, Mycobacterium tuberculosis
             ZP_00059442.1, 172aa  4E-24        66/177 (37.29%)   98/177 (55.37%)   hypothetical protein, Thermobifida fusca
             NP_644099.1, 158aa    1E-08        35/107 (32.71%)   58/107 (54.21%)   hypothetical protein, Xanthomonas axonopodis
UNEX         E70508, 487aa         4E-41        145/494 (29.35%)  210/494 (42.51%)  hypothetical protein, Mycobacterium tuberculosis
             ZP_00059443.1, 554aa  5E-39        155/516 (30.04%)  216/516 (41.86%)  hypothetical protein, Thermobifida fusca
             NP_280206.1, 514aa    1E-07        107/475 (22.53%)  169/475 (35.58%)  hypothetical protein, Halobacterium sp.
GTFA         AAM94798.1, 376aa     4E-69        155/370 (41.89%)  205/370 (55.41%)  CalG3 glycosyltransferase, Micromonospora echinospora
             CAC16413.2, 382aa     2E-60        150/373 (40.21%)  187/373 (50.13%)  glycosyltransferase, Streptomyces olivaceus
             AAF01811.1, 390aa     5E-54        138/374 (36.9%)   179/374 (47.86%)  glycosyl transferase, Streptomyces nogalater

[Table 1, continued. The entries on this page are illegible in the source. The legible fragments indicate polyketide synthase matches, including NysI, NysB and NysC (Streptomyces noursei), PimS1 and PimS2 (Streptomyces natalensis), AmphI and AmphC (Streptomyces nodosus), and polyketide synthases of Streptomyces avermitilis and Saccharopolyspora spinosa.]

deoxyoleandolide synthase.Streptomyces 
antibioticus . 


polyketide synthase.Streptomyces avermitilis 


AmphJ polyketide synthase.Streptomyces 
nodosus 


NysC polyketide synthase.Streptomyces 
noursei 


polyketide synthase.Streptomyces 
hygroscopicus 


AmphC polyketide synthase.Streptomyces 
nodosus 


NysC polyketide synthase.Streptomyces 
noursei 


PimS1 polyketide synthase.Streptomyces 
natalensis 


malonyl CoA-ACP transacylase.Bacillus 
haiodurans 


malonyl-CoA-ACP transacylase.Salmonella 
typhimurium 


malonyl-CoA-ACP transacylase.Streptomyces 
aureofaciens 


hypothetical protein, Nostoc sp. 


hypothetical protein.Synechocystis sp. I 


hypothetical protein.Chloroflexus aurantiacus 


hypothetical protein.Geobacter 
metallireducens 


hypothetical proteinjhermotoga maritima 


daunorubicin resistance protein, Pyrococcus 
furiosus 


StrL.Streptomyces glaucescens 


4-ketoreductase,Streptomyces antibioticus | 


dTDP-4-dehydrorhamnose 
reductase,Streptomyces nogalater 


epimerase,Streptomyces griseus 


1957/3237 (60.46%) 


1948/3170 (61.45%) 


3366/5719(58.86%) 


2755/4464 (61.72%) 


3074/5643 (54.47%) 


2273/3588 (63.35%) 


2280/3684 (61.89%) 


2241/3564 (62.88%) 


118/294 (40.14%) 


120/303 (39.6%) 


110/286 (38.46%) 


97/220 (44.09%) 


112/255 (43.92%) 


99/224 (44.2%) 


176/334 (52.69%) 


186/330 (56.36%) 


176/327 (53.82%) 


173/290 (59.66%) 


165/285 (57.89%) 


161/289 (55.71%) 


137/195 (70.26%) 


1643/3237 (50.76%) 


1612/3170 (50.85%) 


2761/5719(48.28%) 


2313/4464 (51.81%) 


2448/5643 (43.38%) 


1913/3588 (53.32%) 


1907/3684 (51.76%) 


1879/3564 (52.72%) 


72/294 (24.49%) 


73/303 (24.09%) 


74/286(25.87%) 


60/220(27.27%) | 


70/255 (27.45%) 


56/224 (25%) 


142/334 (42.51%) 


131/330 (39.7%) 


121/327 (37%) 


152/290 (52.41%) 


139/285 (48.77%) 


136/289 (47.06%) 


108/195 (55.38%) 


1 E-200 


1 E-200 


1 E-200 


1 E-200 


1 E-200 


1 E-200 


1 E-200 


1 E-200 


1E-09 


CO 

o 

LU 


1E-07 


6E-11 


2E-08 


CO 

o 

LU 
CO 


1E-54 


5E-52 


5E-49 


4E-73 


5E-65 


8E-63 


1E-56 


AAF82408.1,4150aa 


BAB69307.1,3352aa 


AAK73502.1.5644aa 


AAF71776.i,11096aa 


CAA60460.1,8563aa 


AAK73514.1,10917aa 


AAF71 776.1, 11096aa 


CAC20931.1,6797aa 


D83961,313aa 


CO 
CO 
Oi 
O 
CO 

CO 
CVJ 
r— 

o 


AAK60008.1,316aa 


AD2333,275aa 


CO 
CO 

Oi 
CVJ 

o" 

CVI 
CO 

CO 


ZP_00019722.1,251aa 


ZP_00080468.1,308aa 


D72257,327aa 


CO 

CO 
cvj 

CO 

cvi 

CO 
00 

">! 
CL 
Z 


CAA07388.1,305aa 


AAF59936.1,294aa 


AAF01815.1,291aa 


CAA44442,200aa 






7510 






3872 






CO 
CO 
CO 






CO 
00 
CVJ 






Oi 
CVI 
CO 






r— 

CO 






o 

CVJ 






PKSH 






PKSH 






AYTF 






MEAY 






ABCD 






DEPL 






EPIM ! 












CO 






0) 






o 

CVJ 






cvi 






CVJ 
CVJ 






CO 
CVJ 



epimerase.Streptomyces glaucescens 


epimerase.Streptomyces rishiriensis 


sugaractivating enzyme.Streptomyces griseus 


glucose-1 -phosphate 
thymidyltransferase.Streptomyces sp. 


dTDP-D-glucose synthase.Streptomyces 
antibioticus : 


putative dehydratase.Streptomyces coelicolor | 


dTDP-glucose 4,6-dehydratase.Streptomyces 
argillaceus 


dTDP-glucose 4,6-dehydratase.Streptomyces 
rimosus 


thioesterase.Streptomyces avermitilis | 


thioesterase.Streptomyces venezuelae | 


thioesterase.Amycolatopsis mediterranei 


hypothetical protein, Ralstonia metallidurans | 


hypothetical protein, Rhodobacter sphaeroides! 


acyl-CoA synthase,Mycobacterium leprae 


amino oxIdase.Streptomyces coelicolor 


hypothetical protein, Pseudomonas 
fluorescens ! 


tryptophan monooxygenase, Pseudomonas 
syringae 


hypothetical protein.Streptomyces coelicolor I 


phosphopantetheinyl 
transferase.Streptomyces verticillus 


hypothetical protein.Streptomyces sp. 


hypothetical protein.Streptomyces coelicolor | 


Sim18,Streptomyces antibioticus 


hypothetical protein, Mycobacterium 
tuberculosis 


transcriptional activator.Streptomyces 
venezuelae 


129/191 (67.54%) 


121/188 (64.36%) 


263/328 (80.18%) 


261/328 (79.57%) 


260/329 (79.03%) 


218/318(68.55%) 


218/317(68.77%) 


214/318(67.3%) 


97/239 (40.59%) 


95/242 (39.26%) 


88/225 (39.11%) 


188/466 (40.34%) 


192/474 (40.51%) 


195/495 (39.39%) 


383/522 (73.37%) 


369/521 (70.83%) 


370/521 (71.02%) 


132/226 (58.41%) 


127/228 (55.7%) 


109/214(50.93%) 


195/275 (70.91%) 


190/269 (70.63%) 


187/276 (67.75%) 


445/1014(43.89%) 


108/191 (56.54%) 


104/188 (55.32%) 


215/328 (65.55%) 


217/328 (66.16%) 


214/329 (65.05%) 


201/318(63.21%) 


200/317(63.09%) 


191/318(60.06%) 


74/239(30.96%) 


73/242 (30.17%) 


61/225 (27.11%) 


116/466 (24.89%) 


125/474 (26.37%) 


120/495 (24.24%) 


318/522 (60.92%) 


280/521 (53.74%) 


280/521 (53.74%) 


115/226 (50.88%) 


105/228 (46.05%) 


91/214(42.52%) 


169/275 (61.45%) 


163/269 (60.59%) 


159/276 (57.61%) 


331/1014(32.64%) 


2E-55 


1E-51 


1E-125 


1E-122 


1E-119 


1E-108 


1E-107 


1E-105 


7E-22 


3E-17 


3E-15 


1E-27 


5E-22 


o 

CVJ 
1 

CP 


1 E-200 


1E-172 


1E-171 


3E-50 


1E-43 


5E-36 


r>» 
q> 

LU 
CVI 


1E-91 


8E-89 


1E-113 


CO 
CO 

o 
o 

C\J_ 
CO 

I s - 
in 
m 
to 

O 


AAG29805.1,198aa 


CAA68514.1,355aa 


BAC55207.1,350aa 


AAF59934.1,356aa 


NP_625052.1.324aa 


CO 
CO 

s 

LO 
LO 

o 
O 


AAF82605.1,317aa 


BAB69315.1,255aa 


T1 741 3,281 aa 


AAC01 736.1, 254aa 


ZP_00025699.1,510aa 


ZP_00006768.1,501aa 


G87227,548aa 


CAB76876.1.565aa 


ZP_00086824.1,560aa 


ZP_001 26831. 1,559aa 


CAA1 9952.1, 226aa 


AAG43513.1,246aa 


BAA22407.1,208aa 


CAA19951.1,295aa 


AAL15596.1,293aa 


NP_217311.1,324aa 


AAC68887.1,928aa 






GO 

Si 

CO 






GO 

Si 

CO 






cvi 






o 






CO 
LO 
LO 






T- 

CO 
CVJ 






CO 

o 

CO 






CO 

o> 

O) 






NUTA 






DEPA 






TESA 






CALB 






TMOA 






PPTF 






UNAK 






REGD 






CVJ 






LO 
CVJ 






CO 
CVI 






CVJ 






GO 
CVJ 






O) 

cvj 






o 

CO 






CO 



co 
'</} 
c 

CD 

c 
o 

€ 

03 

c 

</> 

CD 

E 

•*— « 
CO 
lT 
O 
JO 

9? 
£ 

-Q 

z 



CO 

E 
o 

Q. 

CD 

55 

c 

*CD 

E " 



CO 
CD 

E 
£ 

Cl 

CO 
0) 

co 

CO 



o 

CO 

a <q 

w o 

CD ^ 

> o 

3 C 

Q. CO 



CO 
CD 

E 
o 

Q. 

CD 



CO co 

CD g 

CO © 

0 c? 

CO JZ 

P CD 

C CO 

1 2 



CO . 
CD 

o 
>* 

E 

2 

Q. 

2 

CO 
cd" 

CO 
CO 



CO 

is 

O .9 

8.? 

"E 'c 

CO CO 




si 

o ■= 
.E co 

It' 
Sf 

in ST 



o 
o 
o 

O o 

"D CO 
O ® 

.£ 2 
j? 55 

9 CD 

o .S> 



CO 
CO 

c 
o 
E 
o 

CO 

X 

o 

II 

CO c 
C CD 

8. O 

s « 



o 
o 
.2 
© 
o 
o 

CO 
CD 

& 

E 
o 

CD 

00 
c" 

'CD 

2 

Q. 

"c0 
O 

« 

o 

Q. 

>* 



C\J 
CM 



in 

CM 



CO 



00 



00 

Si 



co 

in 
oo 

CO 

o 
in 

04 



00 
CO 

CO 



5 

04 



r*« 

oo 
co 

in 
o 

LO 
O) 

00 



in 



CM 

oS 

CO 

o> 
in 
oo 

in 

CM 



O) I s -; 

cd in 

oo oo 



CM CM 

o5 co 



CO 



CM 



co 
oo 



m 



co 
oo 



oo 
oS 
S 

in 
oo 

CO 
CM 

cn 



co 
oS 

in 
oo 
oo 



05 



CD 
CO 

d 
in 

m 
o 

LO 

CO 

in 

CM 



oo 
oo 

in 



8 

o 

CM 



LU 



LU 



LU 



LU 



LU 



CO t- 

00 00 

LU LU 

CO i" 



LU 
00 



LU 



CO 
O 



LU 



CM 
O 



LU 



LU 



LU 



CO 



LU 



CO 
CO 

CM 

in 



tj- CO 

CO CM 



CM 

in 

CO 

z 



CO 
CO 

o> 

LO 



00 
CO 

i 



S 2 .2 



CO 
Q 
< 



CO 
CO 
LO 
CM 



00 

CM 

3 
< 
< 



CO 

co 
m 

3 



o 

CO 
00 
CM 

in 
CO 

< 

CD 



CO 

CO 



s 

CJ> 
00 

CO 

< 



CO 

o 
in 



CD 
—I 
< 
O 



CO 

oo 



CO 
CO 

co 

CM 
CM 



in 

in 
oo 

i 

o 



CO 

CO 



[0145] The gene product of each of ORFs 1-38 in the compound 2(a) locus is
assigned to a protein family based on sequence similarity to known proteins,
as determined in Table 1. A putative function is attributed to each gene
product of the compound 2(a) biosynthetic locus based on the known
function of members of the respective protein families. Each protein family is 
referred to by a four-letter designation used throughout the description and 
figures. For example, members of protein family ABCD including the gene 
product of ORF 21 (SEQ ID NO: 43) are transmembrane transporters; members
of protein family ADHY including the gene product of ORF 33 (SEQ ID NO: 67) are
amidinohydrolases; members of protein family ADSN including the gene product 
of ORF 34 (SEQ ID NO: 69) are adenylation/condensing enzymes; members of 
protein families AYTF and AYTP including ORFs 19 and 35 (SEQ ID NOS: 39 
and 71) are acyltransferases; members of protein family CALB are acyl CoA 
ligases including ORF 27 and 36 (SEQ ID NO: 55 and 73); members of protein 
family CTFC including ORF 32 (SEQ ID NO: 65) are 
carboxyltransferase/decarboxylases; members of protein families DEPA and 
DEPL including ORFs 25 and 22 (SEQ ID NOS: 51 and 45) are 
dehydratase/epimerases; members of protein family EPIM including ORF 23 
(SEQ ID NO: 47) are epimerases; members of protein family GTFA including ORF
9 (SEQ ID NO: 19) are glycosyl transferases; members of protein family MEAY 
including ORF 20 (SEQ ID NO: 41) are membrane proteins; members of protein 
family NUTA including ORF 24 (SEQ ID NO: 49) are nucleotidyltransferases; 
members of protein family PKSH including ORFs 10, 11, 12, 13, 14, 15, 16, 17 
and 18 (SEQ ID NOS: 21, 23, 25, 27, 29, 31, 33, 35 and 37) are polyketide 
synthase, type I proteins; members of PPTF protein family including ORF 29 
(SEQ ID NO: 59) are phosphopantetheinyl transferases; members of protein 
family REGD including ORFs 3 and 31 (SEQ ID NOS: 6 and 63) are 
transcriptional regulators; members of protein family RREB including ORF 4 
(SEQ ID NO: 8) are response regulators; members of protein family SPKK 
including ORF 5 (SEQ ID NO: 10) are sensory protein kinases; members of 
protein family TESA including ORFs 2 and 26 (SEQ ID NOS: 4 and 53) are
thioesterases; and members of protein family TMOA including ORF 28 (SEQ ID
NO: 57) are monooxygenases. A more detailed description of the function of
each protein family, and of the correlation between structure and function
for each family, is provided in Table 2.
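
Table 1 reports each database match as a count of matched positions over the alignment length followed by a percentage, for example 1643/3237 (50.76%). The following minimal sketch shows how such an entry can be parsed and its percentage recomputed. The function names and the 0.01 tolerance are illustrative assumptions and are not part of the original analysis.

```python
import re

# Matches entries such as "1643/3237 (50.76%)" or "3366/5719(58.86%)".
_ENTRY = re.compile(r"^\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+(?:\.\d+)?)\s*%\s*\)\s*$")


def parse_entry(text):
    """Split 'matches/aligned (percent%)' into (matches, aligned, percent)."""
    m = _ENTRY.match(text)
    if m is None:
        raise ValueError(f"unrecognised entry: {text!r}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def percent(matches, aligned):
    """Percentage of aligned positions that are matched."""
    if aligned <= 0:
        raise ValueError("alignment length must be positive")
    return 100.0 * matches / aligned


def is_consistent(text, tolerance=0.01):
    """True if the reported percentage agrees with the reported counts.

    The tolerance (an assumption) allows for rounding of the printed value.
    """
    matches, aligned, reported = parse_entry(text)
    return abs(percent(matches, aligned) - reported) <= tolerance
```

For instance, `is_consistent("1643/3237 (50.76%)")` holds, whereas pairing the same counts with a different percentage does not.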
Table 2

Protein
Family    Function

ABCD      ABC transporter; ATP-binding cassette transmembrane transporter; includes
          proteins with similarity to Mdr proteins of mammalian tumor cells that confer
          resistance to chemotherapeutic agents.

ADHY      amidinohydrolase; agmatine ureohydrolase; hydrolyzes linear amidines; requires
          manganese for catalysis and contains a conserved His important for catalytic
          function.

ADSN      adenylating/condensing synthase; amide synthase; enzymes able to activate
          substrates as acyl adenylates and subsequently transfer the acyl group to an
          amino group of the acceptor molecule.

AYTF      acyltransferase; acyl CoA-acyl carrier protein transacylase; includes malonyl
          CoA-ACP transacylases.

AYTP      acyltransferase; pyridoxal phosphate-dependent; includes 5-aminolevulinate
          synthase, a glycyl transferase that condenses glycine and succinyl-CoA.

CALB      acyl CoA ligase; shows similarity to plant coumarate CoA ligases, other aryl CoA
          ligases, yeast CoA synthetase and aminocoumarin ligases.

CTFC      carboxyltransferase/decarboxylase; carboxyltransferase component of acetyl-CoA
          carboxylase, generally a 2 subunit component; this family consists of a
          fusion of the beta and alpha subunits (beta-alpha).

DEPA      dehydratase/epimerase; dTDP-glucose 4,6-dehydratases, catalyze the second
          step in 6-deoxyhexose biosynthesis.

DEPL      dehydratase/epimerase; similar to StrL dTDP-dihydrostreptose synthase, OleU
          4-ketoreductase; SnogC putative dTDP-4-dehydrorhamnose reductase.

EPIM      epimerase; NDP-hexose epimerase; TDP-4-ketohexose-3,5-epimerases,
          convert TDP-4-keto-6-deoxy-D-glucose to TDP-4-keto-6-deoxy-L-mannose
          (TDP-4-keto-L-rhamnose).

GTFA      glycosyl transferase.

MEAY      membrane protein; putative transporter, permease.

NUTA      nucleotidyltransferase; dNDP-glucose synthase; alpha-D-glucose-1-phosphate
          thymidylyltransferase; catalyze the first step in 6-deoxyhexose biosynthesis.

PKSH      polyketide synthase, type I.

PPTF      phosphopantetheinyl transferases, required for activation of both PKSs and
          NRPSs from inactive apo forms to active holo forms.

REGD      transcriptional regulator.

RREB      response regulator; similar to response regulators that are known to bind DNA
          and act as transcriptional activators.

SPKK      sensory protein kinase.

TESA      thioesterase.

TMOA      monooxygenase; strong similarity to plasmid-encoded tryptophan-2-
          monooxygenases.

UNAK      unknown; homolog of S. coelicolor hypothetical protein.

UNEW      unknown; similar to putative integral membrane protein in S. coelicolor.

UNEX      unknown; domain homology to many bacterial putative membrane proteins;
          contain so-called "bacterial membrane flanked domains" found in an
          uncharacterised family of membrane proteins that have one to three copies of
          the domain flanked by transmembrane helices.

UNFI      unknown; similar to putative membrane proteins.



[0146] Biosynthesis of Compound 2(a) involves the multimodular type I polyketide
synthase system (PKS) of ORFs 10 to 18 (SEQ ID NOS: 21, 23, 25, 27, 29, 31,
33, 35 and 37) illustrated in Figure 1. Type I PKSs are large modular proteins
that condense acyl thioester units in a sequential manner. PKS systems consist 
of one or more polyfunctional polypeptides each of which is made up of modules. 
Each type I PKS module contains three domains: a β-ketoacyl protein synthase
(KS), an acyltransferase (AT) and an acyl carrier protein (ACP). Domains
conferring additional enzymatic activities such as ketoreductase (KR),
dehydratase (DH) and enoylreductase (ER) can also be found in the PKS
modules. These additional domains result in various degrees of reduction of the
β-keto groups of the growing polyketide chain. Each module is responsible for
one round of condensation and reduction of the β-ketoacyl units. There is a
direct correlation between the number of modules and the length of the 
polyketide chain as well as between the domain composition of the modules and 
the degree of reduction of the polyketide product. The final polyketide product is 
released from the PKS protein through the action of a thioesterase domain found 
in the ultimate module of the PKS system. The genetic organization of most type 
I PKS enzymes is colinear with the order of biochemical reactions giving rise to 
the polyketide chain. One skilled in the art will readily understand that these
features allow prediction of polyketide core structure based on the architecture of
the PKS modules found in a given biosynthetic pathway [Hopwood, Chem. Rev.,
97:2465-2497 (1997)].
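
The correlation between domain composition and degree of reduction can be sketched as follows. This is a simplified model that assumes the usual order of reductive processing (KR gives a hydroxyl, DH a double bond, ER a saturated carbon). The function names are hypothetical, and the trailing asterisk for an inactive domain follows the KR* notation of Table 3.

```python
def active_domains(module):
    """Return the active domains of a module.

    `module` is a list of domain names such as ["KS", "AT", "KR", "ACP"].
    A trailing "*" (for example "KR*") marks a domain treated as inactive.
    """
    return {name for name in module if not name.endswith("*")}


def beta_carbon_state(module):
    """Predict the state of the beta-carbon left by one extension module.

    Simplified rule (an assumption of this sketch): each reductive domain
    acts only if the preceding one is present and active.
    """
    domains = active_domains(module)
    if "KS" not in domains:
        raise ValueError("not an extension module (no active KS domain)")
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "double bond"
    return "methylene"
```

Under this model a KS-AT-KR-ACP module leaves a hydroxyl group, a KS-AT-DH-KR-ACP module leaves a double bond, and a module whose KR domain is inactive leaves the β-keto group unreduced.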

[0147] The compound 2(a) locus PKS system is composed of ORFs 10 to 18
(SEQ ID NOS: 21, 23, 25, 27, 29, 31, 33, 35 and 37) and comprises a total of 27
modules described in Table 3. The first module contains only an ACP domain
and corresponds to the loading module (module 0), whereas each of the
remaining 26 modules contains domains KS, AT and ACP in various combinations
with KR, DH and ER domains. The thioesterase domain present in ORF
18/module 26 indicates that this module is the ultimate one in the biosynthesis of
the polyketide chain. The dehydratase domains in modules 6 and 11 as well as
the ketoreductase domain in module 12 appear to be inactive due to the presence of
non-conservative amino acid residues in highly conserved regions important for
catalysis.
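
Table 3 gives each domain both as an amino acid residue range and as a nucleic acid residue range, for example 57-118 and 169-354 for the ACP domain of the loading module. The sketch below shows the codon arithmetic that relates the two. It assumes 1-based inclusive coordinates and an open reading frame whose first codon starts at nucleotide 1; the function names are illustrative.

```python
def aa_to_nt(aa_start, aa_end):
    """Convert an amino acid residue range to the nucleotide range encoding it."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("invalid amino acid range")
    # Residue n is encoded by nucleotides 3n-2, 3n-1 and 3n.
    return 3 * aa_start - 2, 3 * aa_end


def nt_to_aa(nt_start, nt_end):
    """Convert a codon-aligned nucleotide range to an amino acid residue range."""
    if nt_start < 1 or nt_end < nt_start:
        raise ValueError("invalid nucleotide range")
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range does not fall on codon boundaries")
    return (nt_start + 2) // 3, nt_end // 3
```

With these conventions, `aa_to_nt(57, 118)` gives `(169, 354)` and `nt_to_aa(22087, 22272)` gives `(7363, 7424)`, in agreement with the corresponding rows of Table 3.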

Table 3

compound 2(a) locus PKS domain coordinates

SEQ ID NO is given as amino acid/nucleic acid.

ORF   SEQ ID   Amino acid    Nucleic acid   Homology  Module
no.   NO       residues      residues                 no.

10    21/22    57-118        169-354        ACP       0
10    21/22    141-566       421-1698       KS        1
10    21/22    597-1031      1789-3093      AT        1
10    21/22    [...]         3910-4551      KR        1
10    21/22    [...]         4807-4992      ACP       1
10    21/22    1690-2118     5068-6354      KS        2
10    21/22    2135-2562     6403-7686      AT        2
10    21/22    [...]         8497-9135      KR        2
10    21/22    3130-3191     9388-9573      ACP       2
10    21/22    3215-3640     9643-10920     KS        3
10    21/22    3660-4089     10978-12267    AT        3
10    21/22    4102-4208     12304-12624    DH        3
10    21/22    4612-4829     13834-14487    KR        3
10    21/22    4911-4972     14731-14916    ACP       3

[...]

13    27/28    [...]         [...]          KS        12
13    27/28    484-920       1450-2760      AT        12
13    27/28    1195-1406     3583-4218      KR*       12
13    27/28    1490-1551     4468-4653      ACP       12

14    29/30    35-460        103-1380       KS        13
14    29/30    487-918       1459-2754      AT        13
14    29/30    1219-1431     3655-4293      KR        13
14    29/30    1514-1575     4540-4725      ACP       13
14    29/30    1602-2027     4804-6081      KS        14
14    29/30    2046-2473     6136-7419      AT        14
14    29/30    2486-2592     7456-7776      DH        14
14    29/30    2980-3196     8938-9588      KR        14
14    29/30    3287-3339     9832-10017     ACP       14
14    29/30    3363-3788     10087-11364    KS        15
14    29/30    3810-4237     11428-12711    AT        15
14    29/30    4249-4355     12745-13065    DH        15
14    29/30    4760-4976     14278-14928    KR        15
14    29/30    5060-5124     15187-15372    ACP       15

15    31/32    35-460        103-1380       KS        16
15    31/32    480-914       1438-2742      AT        16
15    31/32    926-1032      2776-3096      DH        16
15    31/32    1423-1639     4267-4917      KR        16
15    31/32    1737-1798     5209-5394      ACP       16
15    31/32    1822-2247     5464-6741      KS        17
15    31/32    2263-2690     6787-8070      AT        17
15    31/32    2703-2809     8107-8427      DH        17
15    31/32    3188-3404     9562-10212     KR        17
15    31/32    3483-3544     10447-10632    ACP       17
15    31/32    3568-3993     10702-11979    KS        18
15    31/32    4017-4442     12049-13326    AT        18
15    31/32    4456-4562     13366-13686    DH        18
15    31/32    4978-5194     14932-15582    KR        18
15    31/32    5285-5346     15853-16038    ACP       18

16    33/34    35-460        103-1380       KS        19
16    33/34    481-917       1441-2751      AT        19
16    33/34    1205-1416     3613-4248      KR        19
16    33/34    1500-1561     4498-4683      ACP       19
16    33/34    1585-2010     4753-6030      KS        20
16    33/34    2067-2505     6199-7515      AT        20
16    33/34    2786-2998     8356-8994      KR        20
16    33/34    3083-3144     9247-9432      ACP       20

17    35/36    40-465        118-1395       KS        21
17    35/36    503-941       1507-2823      AT        21
17    35/36    954-1060      2860-3180      DH        21
17    35/36    1456-1672     4366-5016      KR        21
17    35/36    1751-1812     5251-5436      ACP       21
17    35/36    1835-2260     5503-6780      KS        22
17    35/36    2281-2718     6841-8154      AT        22
17    35/36    2731-2837     8191-8511      DH        22
17    35/36    3188-3546     9562-10638     ER        22
17    35/36    3551-3767     10651-11301    KR        22
17    35/36    3846-3907     11536-11721    ACP       22
17    35/36    3932-4357     11794-13071    KS        23
17    35/36    4373-4803     13117-14409    AT        23
17    35/36    4815-4921     14443-14763    DH        23
17    35/36    5300-5516     15898-16548    KR        23
17    35/36    5597-5658     16789-16974    ACP       23
17    35/36    [illegible]   [illegible]    KS        24
17    35/36    [illegible]   [illegible]    AT        24
17    35/36    [illegible]   [illegible]    DH        24
17    35/36    [illegible]   [illegible]    KR        24
17    35/36    7363-7424     22087-22272    ACP       24

18    37/38    34-459        100-1377       KS        25
18    37/38    502-926       1504-2778      AT        25
18    37/38    938-1044      2812-3132      DH        25
18    37/38    1420-1636     4258-4908      KR        25
18    37/38    1715-1776     5143-5328      ACP       25
18    37/38    1799-2224     5395-6672      KS        26
18    37/38    2247-2673     6739-8019      AT        26
18    37/38    2686-2792     8056-8376      DH        26
18    37/38    3203-3419     9607-10257     KR        26
18    37/38    3513-3574     10537-10722    ACP       26
18    37/38    3649-3872     10945-11616    TE        26



[0148] One skilled in the art would understand that all KS domains are functional,
as the multiple amino acid alignment of KS domains present in the compound
2(a) locus PKS system (Figure 2) shows an overall similarity of domains and
conservation of amino acid residues and domain regions important for activity.
Similarly, multiple amino acid alignments of AT domains (Figure 3), ER domains
(Figure 5), ACP domains (Figure 7) and TE domains (Figure 8) show an overall
similarity of related domains and a high conservation of protein regions and of
amino acid residues important for catalytic activity. The domains that occur only
once in the compound 2(a) locus PKS, namely the enoyl reductase (ER) domain
in ORF 17 (SEQ ID NO: 35) and the thioesterase (TE) domain in ORF 18 (SEQ
ID NO: 37), are compared to prototypical domains from the nystatin type I
polyketide system (Figures 5 and 8) (see Brautaset et al., supra).

[0149] Comparison of DH domains found in the compound 2(a) locus PKS
indicates a high conservation of amino acid residues important for catalytic
activity (Figure 4). However, two DH domains are inactive as they contain non-
conservative amino acid substitutions in a region of high sequence conservation.
As highlighted in Figure 4, the DH domain of module 6 in ORF 11 (SEQ ID NO:
23) and the DH domain of module 11 in ORF 12 (SEQ ID NO: 25) contain
substitutions of charged amino acids arginine and glutamic acid respectively for
non-charged aliphatic amino acids.

[0150] Comparison of KR domains found in the compound 2(a) locus PKS
system also displays a conservation of active sites and amino acid residues
important for catalysis, with the exception of the KR domain of module 12 found in
ORF 13 (SEQ ID NO: 27). Figure 6 shows the presence in that module of a
substitution of a glutamine (Q) for a highly conserved tyrosine (Y) amino acid
residue. This non-conservative amino acid substitution results in the inactivation
of the enzymatic activity of the KR domain of module 12 in ORF 13 (SEQ ID NO:
27) (ORF13_pKR01).

[0151] Phylogenetic analysis of the compound 2(a) locus PKS AT domains was
conducted to assess the nature of the β-ketoacyl units that are incorporated in
the growing polyketide chain. The compound 2(a) locus PKS AT domains were
compared to two domains, AAF71779mod03 and AAF71766mod11, derived from
the nystatin PKS system [Brautaset, supra] and specifying the incorporation of
malonyl-CoA and methylmalonyl-CoA respectively. Figure 9 shows the
phylogenetic relatedness of the various AT domains, indicating that, in the
compound 2(a) locus PKS, ORF 13 (SEQ ID NO: 27) module 12 as well as ORF
16 (SEQ ID NO: 33) modules 19 and 20 incorporate methylmalonate in the
polyketide chain, whereas all remaining AT domains incorporate malonate
extender β-ketoacyl units.
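
The AT assignments above can be tabulated with a short sketch. It takes modules 1 to 26 as the extension modules and modules 12, 19 and 20 as the methylmalonate-specific ones, as stated in the text. The two-carbons-per-extension rule is the general rule for malonyl-type extender units and is used here as an assumption of the sketch; all names are illustrative.

```python
from collections import Counter

# Modules 1-26 extend the chain; module 0 is the loading module.
EXTENSION_MODULES = range(1, 27)
# Modules whose AT domain groups with the methylmalonyl-CoA-specific reference.
METHYLMALONATE_MODULES = frozenset({12, 19, 20})


def extender_unit(module):
    """Return the extender unit selected by the AT domain of a module."""
    if module not in EXTENSION_MODULES:
        raise ValueError(f"module {module} is not an extension module")
    if module in METHYLMALONATE_MODULES:
        return "methylmalonyl-CoA"
    return "malonyl-CoA"


def extender_counts():
    """Count how many modules use each extender unit."""
    return Counter(extender_unit(m) for m in EXTENSION_MODULES)


def backbone_carbons_from_extenders():
    """Backbone carbons added by chain extension, two per module (assumed rule)."""
    return 2 * len(EXTENSION_MODULES)


def methyl_branches():
    """Each methylmalonate unit contributes one methyl branch."""
    return len(METHYLMALONATE_MODULES)
```

On these assumptions the 26 extension modules use 23 malonate and 3 methylmalonate units, adding 52 backbone carbons and 3 methyl branches to the starter unit.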

[0152] Domain analysis of the compound 2(a) locus PKS system provides a clear
indication as to synthesis of the polyketide core structure. While not intending to
be limited to any particular mode of action or biosynthetic scheme, the nature and
organization of the compound 2(a) locus PKS modules can explain the synthesis
of Compound 2(a). Figure 10 highlights schematically a series of reactions
catalyzed by the polyketide synthase system based on the correlation between
the deduced domain architecture and the polyketide core of Compound 2(a).
Type I PKS domains and the reactions they carry out are well known to those 
skilled in the art and well documented in the literature; see for example, 
Hopwood, supra. 

[0153] A biosynthetic pathway for the production of the γ-aminobutyryl-CoA
starter unit is also shown. The gene product of ORF 28 (SEQ ID NO: 57), a
member of protein family TMOA, catalyzes the decarboxylative oxidation of
arginine forming 4-guanidinobutanamide. The gene product of ORF 33 (SEQ ID
NO: 67), a member of protein family ADHY, catalyzes hydrolysis of the amidino
group forming γ-aminobutanamide that is further activated by either ORF 27 or 36
(SEQ ID NOS: 55 and 73 respectively), both members of protein family CALB, to
give γ-aminobutyryl-CoA (Figure 10a). The gene product of ORF 19 (SEQ ID
NO: 39), a member of protein family AYTF, loads this unusual
starter unit onto the ACP domain of the loading module (module 0) of ORF 10
(SEQ ID NO: 21), a member of protein family PKSH, as illustrated in Figure 10b.
The polyketide chain continues to grow by the sequential condensation of
malonyl-CoA and methylmalonyl-CoA extender units that are further reduced by
specific domains to various degrees. Dehydratase domains found in module 6 of
ORF 11 (SEQ ID NO: 23) and module 11 of ORF 12 (SEQ ID NO: 25) as well as
the ketoreductase domain found in module 12 of ORF 13 (SEQ ID NO: 27) are
inactive and consequently do not catalyze their respective reductive reactions.
The mature polyketide chain is then released through the action of the
thioesterase domain found in module 26 of ORF 18 (SEQ ID NO: 37), a member
of protein family PKSH, as illustrated in Figure 10b. The polyketide core structure
expected from the architecture of the PKS domains of the compound 2(a) locus is
entirely consistent with the polyketide portion of Compound 2(a).

[0154] The compound 2(a) locus contains genes involved in the synthesis of two
other components found in the chemical structure of Compound 2(a).
Figure 11a illustrates a biosynthetic pathway for the production of the
aminohydroxycyclopentenone moiety found in Compound 2(a). The
gene product of ORF 35 (SEQ ID NO: 71), a member of protein family AYTP,
condenses glycine with succinyl-CoA forming 5-aminolevulinate. This
intermediate is further activated through the action of the gene product of either
ORF 27 or ORF 36 (SEQ ID NOS: 55 and 73 respectively), both members of protein
family CALB, forming 5-aminolevulinate-CoA that may spontaneously cyclize to
produce aminohydroxycyclopentenone. This moiety is subsequently condensed
to the activated carboxy terminus of the polyketide chain through the action of the
gene product of ORF 34 (SEQ ID NO: 69), a member of protein family ADSN, as
illustrated in Figure 10c.

[0155] Figure 11b depicts the biosynthetic pathway of the deoxysugar component
of Compound 2(a). The gene product of ORF 24 (SEQ ID NO: 49), a member of
protein family NUTA, activates D-glucose forming dNDP-D-glucose that is
subsequently dehydrated through the action of the gene product of ORF 25 (SEQ
ID NO: 51), a member of protein family DEPA, forming
dNDP-4-keto-4,6-dideoxy-D-glucose. The gene product of ORF 22 (SEQ ID NO: 45), a member of
protein family DEPL, further reduces this intermediate forming dNDP-D-fucose
that is subsequently epimerized by the gene product of ORF 23 (SEQ ID NO:
47), a member of protein family EPIM, producing dNDP-L-rhamnose.
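
The deoxysugar pathway described above can be written as an ordered list of steps, each naming the ORF, its protein family, the substrate and the product exactly as given in the text. The sketch below is only a bookkeeping aid: it checks that each product is the substrate of the next step. The class and function names are hypothetical.

```python
from typing import NamedTuple


class Step(NamedTuple):
    orf: int          # ORF number in the compound 2(a) locus
    family: str       # four-letter protein family designation
    substrate: str
    product: str


# Steps in reaction order, with intermediates named as in the description.
DEOXYSUGAR_PATHWAY = [
    Step(24, "NUTA", "D-glucose", "dNDP-D-glucose"),
    Step(25, "DEPA", "dNDP-D-glucose", "dNDP-4-keto-4,6-dideoxy-D-glucose"),
    Step(22, "DEPL", "dNDP-4-keto-4,6-dideoxy-D-glucose", "dNDP-D-fucose"),
    Step(23, "EPIM", "dNDP-D-fucose", "dNDP-L-rhamnose"),
]


def is_linear(steps):
    """True if every step consumes the product of the step before it."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def reaction_order(steps):
    """ORF numbers in the order in which their gene products act."""
    return [step.orf for step in steps]
```

Note that the reaction order (ORFs 24, 25, 22, 23) differs from the numerical order of the ORFs in the locus.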

[0156] The final deoxysugar moiety is transferred onto a hydroxyl group of the
polyketide core structure through the action of a glycosyltransferase, i.e. the gene
product of ORF 9 (SEQ ID NO: 19), a member of protein family GTFA, as
illustrated in Figure 10c. Figure 10c proposes one scheme in regard to timing of
the reactions catalyzed by the gene product of ORF 34 (SEQ ID NO: 69), a
member of protein family ADSN, and by the gene product of ORF 9 (SEQ ID NO:
19), a member of protein family GTFA. However, it will be readily understood
that the invention does not reside in the actual timing and order of the reactions
as depicted in Figure 10c.

[0157] Additional proteins encoded by the compound 2(a) locus include the gene
product of ORF 2 (SEQ ID NO: 4), a member of protein family TESA, which is
expected to have polyketide-priming editing functions; the gene products of
ORFs 3, 4, 5 and 31 (SEQ ID NOS: 6, 8, 10 and 63), members of protein families
REGD, RREB, SPKK and REGD respectively, which are expected to regulate synthesis
of Compound 2(a); the gene products of ORFs 6 and 21 (SEQ ID NOS: 12 and
43), members of protein families UNEW and ABCD respectively, which are involved in
transmembrane transport; and the gene product of ORF 29 (SEQ ID NO: 59), a
member of protein family PPTF, which activates ACP domains through
phosphopantetheinylation.

[0158] Structural modifications of compounds of Formula I and Formula II and of
Compound 2(a) are attained by genetic modification of the compound 2(a)
locus. Genetic modifications of PKS biosynthetic loci are well known in the art.
Patent publication WO 01/34816 teaches the construction of a library of
structural variants of the macrolide polyketide rapamycin derived from the genetic
modification of genes in the locus that directs rapamycin synthesis. The genetic
modifications taught include gene inactivation, gene insertion and gene
replacement. These modifications, both individually and in combination at
different positions within the rapamycin locus, resulted in alteration of polyketide
starter units, chain length and hydroxyl stereospecificities in rapamycin. Similarly,
McDaniel et al. [Proc Natl Acad Sci USA, 1999, 96:1846-51] generated a library
of over 50 derivatives of the macrolide antibiotic erythromycin using a
combination of genetic modifications including gene inactivation, macrolide chain
length and hydroxyl stereospecificity modifications of the erythromycin
biosynthesis genes.

[0159] The elucidation of the nucleic acid sequences that encode the
biosynthesis of Compound 2(a) provides the biological tools to enable one skilled
in the art to genetically modify the biosynthetic pathway to generate variants of
Compound 2(a). In particular, Type I PKS systems may be manipulated by
changing the number of modules, their specificities towards carboxylic acids, and
by inactivating or inserting domains with reductive activities (Katz, Chem. Rev. v.
97, 2557-2575, 1997). Thus, the polyketide synthase system of Compound 2(a)
may be engineered by modifying, adding, or deleting domains, or replacing them
with those taken from other Type I PKS enzymes. Compounds of Formula I may
be produced using a modified PKS system created based on the polyketide
synthase system for the production of Compound 2(a). Preferred modified PKS
systems are those wherein a KS, AT, KR, DH or ER domain has been inactivated
or deleted.

[0160] In one aspect, the invention is directed to preparation of a polyketide of 
Formula I or II resulting from a modified polyketide synthase system, which
modifications include deletions, mutagenesis, inactivation or replacement of one or
more of the domains of the invention. The modified polyketide synthase system 
produces compounds of Formula I that may differ from Compound 2a
in size, degree of saturation and oxidation. In another aspect, the invention is
directed to compounds of Formula I or II produced by genetic modification of the 
polyketide synthase system for the compound 2(a) locus. 
[0161] The compounds of this invention may be formulated into pharmaceutical
compositions comprised of compounds of Formula I in combination with a
pharmaceutically acceptable carrier.

[0162] The compounds of this invention are useful in treating bacterial infections,
fungal infections and cancer. 

[0163] Molecular terms, when used in this application, have their common
meaning unless otherwise specified. 

[0164] The term alkyl refers to a linear or branched hydrocarbon group.
Examples of alkyl groups include, without limitation, methyl, ethyl, n-propyl,
isopropyl, n-butyl, pentyl, hexyl, heptyl, cyclopentyl, cyclohexyl, cyclohexylmethyl,
and the like. Alkyl groups may optionally be substituted with one or more
substituents selected from acyl, amino, acylamino, acyloxy, carboalkoxy,
carboxy, carboxyamido, cyano, halo, hydroxyl, nitro, thio, alkyl, alkenyl, alkynyl, 
cycloalkyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, sulfinyl, sulfonyl, oxo, 
guanidino and formyl. 

[0165] The term alkenyl refers to a linear, branched or cyclic hydrocarbon group
containing at least one carbon-carbon double bond. Examples of alkenyl groups
include, without limitation, vinyl, 1-propene-2-yl, 1-butene-4-yl, 2-butene-4-yl,
1-pentene-5-yl and the like. Alkenyl groups may optionally be substituted with one
or more substituents selected from acyl, amino, acylamino, acyloxy, carboalkoxy, 
carboxy, carboxyamido, cyano, halo, hydroxyl, nitro, thio, alkyl, alkenyl, alkynyl, 
cycloalkyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, sulfinyl, sulfonyl, formyl, 
oxo and guanidino. The double bond portion(s) of the unsaturated hydrocarbon 
chain may be either in the cis or trans configuration. 
[0166] The term cycloalkyl or cycloalkyl ring refers to a saturated or partially
unsaturated carbocyclic ring in a single or fused carbocyclic ring system having
from three to fifteen ring members. Examples of cycloalkyl groups include,
without limitation, cyclopropyl, cyclobutyl, cyclohexyl, and cycloheptyl. Cycloalkyl
groups may optionally be substituted with one or more substituents selected
from acyl, amino, acylamino, acyloxy, carboalkoxy, carboxy, carboxyamido, 
cyano, halo, hydroxyl, nitro, thio, alkyl, alkenyl, alkynyl, cycloalkyl, heterocyclyl, 
aryl, heteroaryl, alkoxy, aryloxy, sulfinyl, sulfonyl and formyl. 
[0167] The term heterocycloalkyl, heterocyclic or heterocycloalkyl ring refers to a
saturated or partially unsaturated ring containing one to four hetero atoms or
hetero groups selected from O, N, NH, NRx, PO2, S, SO or SO2 in a single or
fused heterocyclic ring system having from three to fifteen ring members.
Examples of heterocycloalkyl groups include, without limitation, morpholinyl,
piperidinyl, and pyrrolidinyl. Heterocycloalkyl groups may optionally be 
substituted with one or more substituents selected from acyl, amino, acylamino, 
acyloxy, oxo, thiocarbonyl, imino, carboalkoxy, carboxy, carboxyamido, cyano,
halo, hydroxyl, nitro, thio, alkyl, alkenyl, alkynyl, cycloalkyl, heterocyclyl, aryl,
heteroaryl, alkoxy, aryloxy, sulfinyl, sulfonyl and formyl. 
[0168] The term amino acid refers to a natural amino acid, a synthetic amino acid
or a synthetic derivative of a natural amino acid. Examples of natural amino 
acids include, but are not limited to alanine, arginine, asparagine, aspartic acid, 
cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, 
methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine and 
valine. 

[0169] The term halo is defined as a bromine, chlorine, fluorine or iodine atom.
[0170] The term aryl or aryl ring refers to an aromatic group comprising a single 
or fused ring system, having from five to fifteen ring members. Examples of aryl 
groups include, without limitation, phenyl, naphthyl, biphenyl and terphenyl. Aryl
groups may optionally be substituted with one or more substituent groups selected
from acyl, amino, acylamino, acyloxy, azido, alkylthio, carboalkoxy, carboxy,
carboxyamido, cyano, halo, hydroxyl, nitro, thio, alkyl, alkenyl, alkynyl, cycloalkyl, 
heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, sulfinyl, sulfonyl and formyl. 
[0171] The term heteroaryl or heteroaryl ring refers to an aromatic group 
comprising a single or fused ring system, having from five to fifteen ring members 
and containing at least one hetero atom such as O, N, S, SO and SO2. 
Examples of heteroaryl groups include, without limitation, pyridinyl, thiazolyl, 
thiadiazolyl, isoquinolinyl, pyrazolyl, oxazolyl, oxadiazolyl, triazolyl, and pyrrolyl
groups. Heteroaryl groups may optionally be substituted with one or more 
substituent groups selected from acyl, amino, acylamino, acyloxy, carboalkoxy, 
carboxy, carboxyamido, cyano, halo, hydroxyl, nitro, thio, thiocarbonyl, alkyl, 
alkenyl, alkynyl, cycloalkyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, sulfinyl, 
sulfonyl, and formyl. 

[0172] As used herein, the term "treatment" refers to the application or 
administration of a therapeutic agent to a patient, or application or administration 
of a therapeutic agent to an isolated tissue or cell line from a patient, who has a 
disorder, e.g., a disease or condition, a symptom of disease, or a predisposition 
toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy,
ameliorate, improve, or affect the disease, the symptoms of disease, or the
predisposition toward disease. 

[0173] As used herein, a "pharmaceutical composition" comprises a
pharmacologically effective amount of a farnesyl dibenzodiazepinone and a 
pharmaceutically acceptable carrier. As used herein, "pharmacologically 
effective amount," "therapeutically effective amount" or simply "effective amount"
refers to that amount of a farnesyl dibenzodiazepinone effective to produce the 
intended pharmacological, therapeutic or preventive result. For example, if a 
given clinical treatment is considered effective when there is at least a 25% 
reduction in a measurable parameter associated with a disease or disorder, a 
therapeutically effective amount of a drug for the treatment of that disease or 
disorder is the amount necessary to effect at least a 25% reduction in that 
parameter. 
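
The 25% criterion above can be expressed as a short calculation. The following Python sketch is illustrative only: the function names, and the assumption that the parameter is measured once at baseline and once after treatment, are not taken from this specification.

```python
def percent_reduction(baseline: float, treated: float) -> float:
    """Percent reduction of a measurable parameter relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - treated) / baseline


def meets_threshold(baseline: float, treated: float,
                    threshold_pct: float = 25.0) -> bool:
    """True if the observed reduction is at least the stated threshold.

    The 25% default mirrors the example in the text; any other clinical
    threshold could be passed instead.
    """
    return percent_reduction(baseline, treated) >= threshold_pct
```

Under these assumptions, a parameter falling from 200 units to 150 units is a 25% reduction and meets the criterion, whereas a fall to 160 units (20%) does not.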

[0174] The term "pharmaceutically acceptable carrier" refers to a carrier for
administration of a therapeutic agent. Such carriers include, but are not limited 
to, saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations 
thereof. The term specifically excludes cell culture medium. For drugs 
administered orally, pharmaceutically acceptable carriers include, but are not 
limited to pharmaceutically acceptable excipients such as inert diluents, 
disintegrating agents, binding agents, lubricating agents, sweetening agents, 
flavoring agents, coloring agents and preservatives. Suitable inert diluents 
include sodium and calcium carbonate, sodium and calcium phosphate, and 
lactose, while corn starch and alginic acid are suitable disintegrating agents. 
Binding agents may include starch and gelatin, while the lubricating agent, if 
present, will generally be magnesium stearate, stearic acid or talc. If desired, the 
tablets may be coated with a material such as glyceryl monostearate or glyceryl 
distearate, to delay absorption in the gastrointestinal tract. 
[0175] Pharmaceutically acceptable salts include acid addition salts and base 
addition salts. The nature of the salt is not critical, provided that it is 
pharmaceutically-acceptable. Without being limited, examples of acid addition 
salts include hydrochloric, hydrobromic, hydroiodic, nitric, carbonic, sulphuric,
phosphoric, formic, acetic, citric, tartaric, succinic, oxalic, malic, glutamic,
propionic, glycolic, gluconic, maleic, embonic (pamoic), methanesulfonic,
ethanesulfonic, 2-hydroxyethanesulfonic, pantothenic, benzenesulfonic,
toluenesulfonic, sulfanilic, mesylic, cyclohexylaminosulfonic, stearic, alginic,
β-hydroxybutyric, malonic, galactaric, galacturonic acid and the like. Suitable
pharmaceutically-acceptable base addition salts of compounds of the invention
include, but are not limited to, metallic salts made from aluminium, calcium,
lithium, magnesium, potassium, sodium and zinc or organic salts made from
N,N'-dibenzylethylenediamine, chloroprocaine, choline, diethanolamine,
ethylenediamine, N-methylglucamine, lysine, procaine and the like. Additional
examples of pharmaceutically acceptable salts are listed in Journal of
Pharmaceutical Sciences, 1977, 66:2. All of these salts may be prepared by
conventional means from the corresponding compounds of Formula I by treating
with the appropriate acid or base.

[0176] The compounds of the present invention can possess one or more
asymmetric carbon atoms and can exist as optical isomers forming mixtures of
racemic or non-racemic compounds. The compounds of the present invention
are useful as a single isomer or as a mixture of stereochemical isomeric forms.
Diastereoisomers, i.e., nonsuperimposable stereochemical isomers, can be
separated by conventional means such as chromatography, distillation,
crystallization and sublimation. The optical isomers can be obtained by 
resolution of the racemic mixtures according to conventional processes. 
[0177] The invention embraces isolated compounds. An isolated compound
refers to a compound which represents at least 10%, 20%, 50% or 80% of the
compound of the present invention present in a mixture, provided that the mixture
comprising the compound of the invention has demonstrable (i.e. statistically
significant) biological activity including antibacterial, antifungal or anticancer
activity when tested in conventional biological assays known to a person skilled
in the art.

[0178] The compounds of the present invention, or pharmaceutically acceptable
salts thereof, can be formulated for oral, intravenous, intramuscular,
subcutaneous, topical or parenteral administration for the therapeutic or
prophylactic treatment of diseases, particularly bacterial and fungal infections.
For oral or parenteral administration, compounds of the present invention can be
mixed with conventional pharmaceutical carriers and excipients and used in the
form of tablets, capsules, elixirs, suspensions, syrups, wafers and the like. The
compositions comprising a compound of the present invention will contain from
about 0.1% to about 99.9%, about 5% to about 95%, about 10% to about 80% or 
about 15% to about 60% by weight of the active compound. 
[0179] The pharmaceutical preparations disclosed herein are prepared in
accordance with standard procedures and are administered at dosages that are
selected to reduce, prevent, or eliminate bacterial and fungal infection or
cancer (See, e.g., Remington's Pharmaceutical Sciences, Mack Publishing
Company, Easton, PA and Goodman and Gilman's The Pharmacological Basis of
Therapeutics, Pergamon Press, New York, NY, the contents of which are
incorporated herein by reference, for a general description of the methods for
administering various antimicrobial agents for human therapy). The compositions
of the present invention can be delivered using controlled (e.g., capsules) or
sustained release delivery systems (e.g., bioerodible matrices). Exemplary
delayed release delivery systems for drug delivery that are suitable for
administration of the compositions of the invention (preferably of Formula I) are
described in U.S. Patent Nos. 4,452,775 (issued to Kent), 5,239,660 (issued to
Leonard) and 3,854,480 (issued to Zaffaroni).

[0180] The pharmaceutically-acceptable compositions of the present invention
comprise one or more compounds of the present invention in association with 
one or more non-toxic, pharmaceutically-acceptable carriers and/or diluents 
and/or adjuvants and/or excipients, collectively referred to herein as "carrier" 
materials, and if desired other active ingredients. The compositions may contain 
common carriers and excipients, such as corn starch or gelatin, lactose, sucrose, 
microcrystalline cellulose, kaolin, mannitol, dicalcium phosphate, sodium chloride 
and alginic acid. The compositions may contain croscarmellose sodium,
microcrystalline cellulose, sodium starch glycolate and alginic acid.
[0181] Lubricants that can be used include magnesium stearate or other metallic
stearates, stearic acid, silicone fluid, talc, waxes, oils and colloidal silica.
[0182] Flavouring agents such as peppermint, oil of wintergreen, cherry
flavouring or the like can also be used. It may also be desirable to add a coloring 
agent to make the dosage form more esthetic in appearance or to help identify 
the product comprising a compound of the present invention. 
[0183] For oral administration, the pharmaceutical compositions are in the form
of, for example, a tablet, capsule, suspension or liquid. For oral use, solid
formulations such as tablets and capsules are particularly useful. Sustained
release or enterically coated preparations may also be devised. Tablet binders
that can be included are acacia, methylcellulose, sodium carboxymethylcellulose,
polyvinylpyrrolidone (Povidone), hydroxypropyl methylcellulose, sucrose, starch
and ethylcellulose. For pediatric and geriatric applications, suspensions, syrups
and chewable tablets are especially suitable. The pharmaceutical composition is
preferably made in the form of a dosage unit containing a therapeutically-effective
amount of the active ingredient. Examples of such dosage units are
tablets and capsules. For therapeutic purposes, the tablets and capsules can
contain, in addition to the active ingredient, conventional carriers such as binding
agents, for example, acacia gum, gelatin, polyvinylpyrrolidone, sorbitol, or
tragacanth; fillers, for example, calcium phosphate, glycine, lactose, maize-starch,
sorbitol, or sucrose; lubricants, for example, magnesium stearate,
polyethylene glycol, silica or talc; disintegrants, for example, potato starch;
flavoring or coloring agents; or acceptable wetting agents. Oral liquid
preparations generally are in the form of aqueous or oily solutions, suspensions,
emulsions, syrups or elixirs and may contain conventional additives such as
suspending agents, emulsifying agents, non-aqueous agents, preservatives, 
coloring agents and flavoring agents. Examples of additives for liquid 
preparations include acacia, almond oil, ethyl alcohol, fractionated coconut oil, 
gelatin, glucose syrup, glycerin, hydrogenated edible fats, lecithin, methyl 
cellulose, methyl or propyl para-hydroxybenzoate, propylene glycol, sorbitol, or 
sorbic acid. 



[0184] For intravenous (IV) use, compounds of the present invention can be
dissolved or suspended in any of the commonly used intravenous fluids and 
administered by infusion. Intravenous fluids include, without limitation, 
physiological saline or Ringer's solution. 

[0185] Formulations for parenteral administration can be in the form of aqueous or
non-aqueous isotonic sterile injection solutions or suspensions. These solutions 
or suspensions can be prepared from sterile powders or granules having one or 
more of the carriers mentioned for use in the formulations for oral administration. 
The compounds can be dissolved in polyethylene glycol, propylene glycol, 
ethanol, corn oil, benzyl alcohol, sodium chloride, and/or various buffers. 
[0186] For intramuscular preparations, a sterile formulation of compounds of the
present invention, or suitable soluble salt forms of the compound, can be
dissolved and administered in a pharmaceutical diluent such as Water-for-Injection
(WFI), physiological saline or 5% glucose. A suitable insoluble form of
the compound may be prepared and administered as a suspension in an 
aqueous base or a pharmaceutically acceptable oil base, e.g. an ester of a long 
chain fatty acid such as ethyl oleate. 

[0187] For topical use the compounds of the present invention can also be prepared
in suitable forms to be applied to the skin, or mucous membranes of the nose and
throat, and can take the form of creams, ointments, liquid sprays or inhalants,
lozenges, or throat paints. Such topical formulations further can include chemical
compounds such as dimethylsulfoxide (DMSO) to facilitate surface penetration of 
the active ingredient. 

[0188] For application to the eyes or ears, the compounds of the present 
invention can be presented in liquid or semi-liquid form formulated in hydrophobic 
or hydrophilic bases as ointments, creams, lotions, paints or powders. 
[0189] For rectal administration the compounds of the present invention can be 
administered in the form of suppositories admixed with conventional carriers such 
as cocoa butter, wax or other glyceride. 

[0190] Alternatively, the compound of the present invention can be in powder
form for reconstitution in the appropriate pharmaceutically acceptable carrier at
the time of delivery. In another embodiment, the unit dosage form of the
compound can be a solution of the compound or a salt thereof in a suitable 
diluent in sterile, hermetically sealed ampoules. 

[0191] The amount of the compound of the present invention in a unit dosage
comprises a therapeutically-effective amount of at least one active compound of 
the present invention which may vary depending on the recipient subject, route 
and frequency of administration. A recipient subject refers to a plant, a cell 
culture or an animal such as an ovine or a mammal including a human. 
[0192] According to this aspect of the present invention, the novel compositions
disclosed herein are placed in a pharmaceutically acceptable carrier and are
delivered to a recipient subject (including a human subject) in accordance with 
known methods of drug delivery. In general, the methods of the invention for 
delivering the compositions of the invention in vivo utilize art-recognized 
protocols for delivering the agent with the only substantial procedural modification 
being the substitution of the compounds of the present invention for the drugs in 
the art-recognized protocols. 

[0193] Likewise, the methods for using the claimed composition for treating cells 
in culture, for example, to eliminate or reduce the level of bacterial or fungal 
contamination of a cell culture, utilize art-recognized protocols for treating cell
cultures with antibacterial or antifungal agent(s) with the only substantial 
procedural modification being the substitution of the compounds of the present 
invention for the agents used in the art-recognized protocols. 
[0194] The compounds of the present invention provide a method for treating
bacterial infections, fungal infections and pre-cancerous or cancerous conditions.
As used herein the term unit dosage refers to a quantity of a therapeutically-effective
amount of a compound of the present invention that elicits a desired
therapeutic response. As used herein the phrase therapeutically-effective
amount means an amount of a compound of the present invention that prevents
the onset, alleviates the symptoms, or stops the progression of a bacterial
infection, fungal infection or pre-cancerous or cancerous condition. The term
treating is defined as administering, to a subject, a therapeutically-effective
amount of at least one compound of the present invention, either to prevent the
occurrence of a bacterial or fungal infection or pre-cancer or cancer condition, or 
to control or eliminate a bacterial or fungal infection or pre-cancer or cancer 
condition. The term desired therapeutic response refers to treating a recipient 
subject with a compound of the present invention such that a bacterial or fungal 
infection or pre-cancer or cancer condition is reversed, arrested or prevented in a 
recipient subject. 

[0195] The compounds of the present invention can be administered as a single
daily dose or in multiple doses per day. The treatment regime may require 
administration over extended periods of time, e.g., for several days or for from 
two to four weeks. The amount per administered dose or the total amount 
administered will depend on such factors as the nature and severity of the 
infection, the age and general health of the recipient subject, the tolerance of the 
recipient subject to the compound and the type of the bacterial or fungal infection, 
or type of cancer. 

[0196] A compound according to this invention may also be administered in the
diet or feed of a patient or animal. The diet for animals can be normal foodstuffs 
to which the compound can be added or it can be added to a premix. 
[0197] The compounds of the present invention may be taken in combination,
together or separately with any known clinically approved antibiotic, anti-fungal or
anti-cancer agent to treat a recipient subject in need of such treatment.
[0198] Compounds of Formula I are obtained biosynthetically by culturing
Actinomycetes species in growth media described in Table 4, at temperatures
between 24 °C and 34 °C and with shaking to aerate the culture medium for 3 to
40 days. The compounds of Formula I are extracted and isolated from the 
bacterial culture by methods known to a skilled person including centrifugation, 
chromatography, adsorption, filtration, extraction or other methods of separation. 
[0199] The compounds of Formula I may be biosynthesized by various
microorganisms. Microorganisms that may synthesize the compounds of the
present invention include but are not limited to bacteria of the order
Actinomycetales, also referred to as actinomycetes. Non-limiting examples of
members belonging to the genera of Actinomycetes include Nocardia,
Geodermatophilus, Actinoplanes, Micromonospora, Nocardioides, Saccharothrix, 
Amycolatopsis, Kutzneria, Saccharomonospora, Saccharopolyspora, 
Kitasatospora, Streptomyces, Microbispora, Streptosporangium and Actinomadura.
The taxonomy of actinomycetes is complex and reference is made to Goodfellow 
(1989) Suprageneric classification of actinomycetes, Bergey's Manual of 
Systematic Bacteriology, Vol. 4, Williams and Wilkins, Baltimore, pp 2322-2339, 
and to Embley and Stackebrandt (1994), The molecular phylogeny and
systematics of the actinomycetes, Annu. Rev. Microbiol. 48, 257-289, for
genera that may synthesize the compounds of the invention, incorporated herein 
in their entirety by reference. 

[0200] Microorganisms biosynthetically producing compounds of Formula I are 
cultivated in culture media containing known nutritional sources for 
actinomycetes having assimilable sources of carbon, nitrogen plus optional 
inorganic salts and other known growth factors at a pH of about 6 to about 9.
Non-limiting examples of growth media are provided in Table 4 below.
Microorganisms are cultivated at incubation temperatures of about 20 °C to about
40 °C for about 3 to about 40 days.
Table 4. Examples of Growth Media for Production of Compounds of Formula I
Unless otherwise indicated all the ingredients are in gm/L

*3 Trace elements solution contains: ZnCl2 (40 mg); FeCl3·6H2O (200 mg); CuCl2·2H2O (10 mg);
MnCl2·4H2O; Na2B4O7·10H2O (10 mg); (NH4)6Mo7O24·4H2O (10 mg) per litre.

*4 Dissolve components in 800 ml water and autoclave, later add: 10 ml KH2PO4 (0.5% solution);
80 ml CaCl2·2H2O (3.68% solution); 15 ml L-proline (20% solution); 100 ml TES buffer (5.73%
solution, pH 7.2); 5 ml NaOH (1N solution), and 2 ml of trace elements solution.

*5 The pH is to be adjusted as marked prior to the addition of CaCO3 in those media containing it.

[0201] The culture media inoculated with the microorganisms which
biosynthetically produce compounds of Formula I, may be aerated by incubating 
the inoculated culture media with agitation, for example shaking on a rotary 
shaker, or a shaking water bath. Aeration may also be achieved by the injection 
of air, oxygen or an appropriate gaseous mixture to the inoculated culture media 
during incubation. 
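
Returning to footnote *4 of Table 4, the supplement volumes and stock concentrations listed there can be converted into masses added and a final volume. The sketch below is a rough check only: it assumes the percentages are weight/volume (g per 100 ml), which the footnote does not state, and the names used are illustrative.

```python
# (volume added in ml, stock concentration in percent) taken from footnote *4.
# Treating "%" as weight/volume is an assumption, not stated in the source.
SUPPLEMENTS = {
    "KH2PO4": (10, 0.5),
    "CaCl2.2H2O": (80, 3.68),
    "L-proline": (15, 20),
    "TES buffer": (100, 5.73),
}
BASE_VOLUME_ML = 800          # water used to dissolve the base components
OTHER_ADDITIONS_ML = 5 + 2    # 1N NaOH and trace elements solution


def grams_added(volume_ml: float, percent_wv: float) -> float:
    """Mass of solute delivered by a given volume of a % (w/v) stock."""
    return volume_ml * percent_wv / 100.0


def final_volume_ml() -> float:
    """Total volume after all post-autoclave additions."""
    supplement_volume = sum(vol for vol, _ in SUPPLEMENTS.values())
    return BASE_VOLUME_ML + supplement_volume + OTHER_ADDITIONS_ML


if __name__ == "__main__":
    for name, (vol, pct) in SUPPLEMENTS.items():
        print(f"{name}: {grams_added(vol, pct):.3f} g")
    print(f"final volume: {final_volume_ml()} ml")
```

Under the weight/volume assumption, the additions bring the 800 ml base to a little over one litre, which is consistent with the table's convention of expressing ingredients per litre.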

[0202] After cultivation and production of compounds of Formula I, the 
compounds can be extracted and isolated from the cultivated culture media by 
techniques known to a skilled person in the art and/or disclosed herein, including 
for example centrifugation, chromatography and adsorption. For example, the
cultivated culture media can be mixed with a suitable organic solvent such as
n-butanol, n-butyl acetate or 4-methyl-2-pentanone; the organic layer can be
separated, for example, by centrifugation, followed by removal of the solvent
by evaporation to dryness or by evaporation to dryness under vacuum. The
resulting residue can optionally be reconstituted with for example water, ethanol, 
ethyl acetate, methanol or a mixture thereof, and re-extracted with a suitable 
organic solvent such as hexane, carbon tetrachloride, methylene chloride or a 
mixture thereof. After removal of the solvent, the compound of Formula I can be 
further purified by the use of standard techniques such as chromatography. 
[0203] The compounds of Formula I that are biosynthesized by microorganisms 
may optionally be subjected to random and/or directed chemical modifications to 
form compounds that are derivatives or structural analogs of compounds of 
Formula I. Derivatives or structural analogs of compounds of Formula I having 
similar functional activities are within the scope of the present invention. 
Compounds of Formula I may optionally be modified using methods known in the 
art and described herein. 

[0204] Unless otherwise indicated, all numbers expressing quantities of 
ingredients and properties such as molecular weight, reaction conditions, IC50 
and so forth used in the specification and claims are to be understood as being 
modified in all instances by the term "about". Accordingly, unless indicated to the
contrary, the numerical parameters set forth in the present specification and
attached claims are approximations. At the very least, and not as an attempt to
limit the application of the doctrine of equivalents to the scope of the claims, each 
numerical parameter should at least be construed in light of the number of 
significant figures and by applying ordinary rounding techniques. 
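
One way to picture "ordinary rounding" to a number of significant figures is the short Python helper below. It is a sketch for illustration; the function name is hypothetical, and Python's built-in `round` resolves exact ties to the nearest even digit, which may differ from other rounding conventions.

```python
import math


def round_sig(value: float, sig_figs: int) -> float:
    """Round a value to the given number of significant figures."""
    if sig_figs < 1:
        raise ValueError("sig_figs must be at least 1")
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 3 for 1234.5 and -3 for 0.004567.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - exponent - 1)
```

For instance, a reported value of 1234.5 read to two significant figures is treated as 1200, and 0.004567 as 0.0046.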
[0205] Notwithstanding that the numerical ranges and parameters setting forth 
the broad scope of the invention are approximations, the numerical values set in 
the examples, Tables and Figures are reported as precisely as possible. Any 
numerical values may inherently contain certain errors resulting from variations in 
experiments, testing measurements, statistical analyses and such. 
[0206] The compounds of Formula I, Formula II and compound 2(a) may
optionally be chemically modified using methods known in the art and described 
herein. 

[0207] The compounds of the invention are made by biofermentation and well-known
chemical schemes. The schemes described herein are exemplary; any
chemical synthetic process known to a person skilled in the art providing the
structures described herein may be used and is therefore comprised in the
present invention.

Scheme 1. Acylation Reactions

EDC = 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide

Protective groups include N-benzyloxycarbonyl (CBZ), N-butoxycarbonyl (BOC),
N-fluoren-9-ylmethoxycarbonyl (FMOC)

Rx represents C1-6 alkyl, C2-6 alkenyl, aryl or heteroaryl

AA represents a naturally occurring amino acid

Scheme 2. Aminations/reductive aminations of terminal nitrogen

R1 is as previously defined. Labels in scheme: Schiff's base; NaBH3CN.

Scheme 3. Olefin reactions

Scheme 4. Ketone reactions

R1 and R8 are as previously defined. Label in scheme: H2O.

Scheme 5. O-Reactions

R1, R5 and R6 are as previously defined.

Scheme 6. Hydrolysis/Esterification

[0208] Scheme 1 is used to obtain Compounds 2(m), 2(n), 2(o), 2(p), 2(q), 2(r),
2(s), 2(t), 2(u), 2(v), 2(w), 2(x), 2(y), 2(z), 2(aa), and 2(ab) from Compound 2(a). 
[0209] Scheme 3 is used to obtain Compound 2(b) from Compound 2(a). 
[0210] Scheme 4 is used to obtain Compounds 2(c), 2(d), 2(e) and 2(f) from 
Compound 2(a). 

[0211] Scheme 6 is used to obtain Compounds 2(g), 2(h), 2(i) and 2(j) from
Compound 2(a). 

Example 1: Production of Compound 2(a) by Fermentation

Example 1(A): Preparation of Strain [C03U03]023

[0212] Strain [C03]023: Streptomyces aizunensis NRRL B-11277 was plated on
three tomato paste oatmeal agar (ATCC medium 1360) plates for sporulation at 
28 °C. The plates were incubated for a period of 5-7 days, after which spores 
were collected from each plate into 5 ml sterile distilled water, spun down by 
centrifugation at 5000 rpm (10 min), and dispersed in 20 ml sterile water. After a 
second centrifugation under the same conditions the pellet was resuspended in 
10 ml sterile distilled water. A series of ten-fold dilutions of the original spore 
suspension were prepared and 0.5 ml aliquots plated on tomato paste-oat meal 
agar until sporulation occurred (5-7 days). Each individual clone from the plates 
with single well-isolated colonies (generated from 10^-8 to 10^-10 dilutions of the
spore suspension) was chosen and transferred to one plate of tomato paste-oat 
meal agar to generate spores for storage. Each clone was grown in 25x150 mm 
glass tubes for its production of Compound 2(a). A total of 385 clones were 
tested for production levels of Compound 2(a). Clone [C03]023 showed
production three times higher than that of the wild-type strain. This clone was chosen,
stored, and used for mutagenesis. 
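
The ten-fold dilution series used above can be sketched numerically. In the Python example below, the function names and the stock spore density used in the usage note are hypothetical illustrations; only the ten-fold steps and the 0.5 ml plated volume come from the text.

```python
def tenfold_series(n_steps: int) -> list:
    """Cumulative dilution factors 10^-1, 10^-2, ..., 10^-n_steps."""
    return [10.0 ** -step for step in range(1, n_steps + 1)]


def expected_colonies(stock_spores_per_ml: float, dilution: float,
                      plated_ml: float = 0.5) -> float:
    """Expected colony count when an aliquot of a dilution is plated.

    Assumes every viable spore gives rise to one colony.
    """
    return stock_spores_per_ml * dilution * plated_ml


if __name__ == "__main__":
    stock = 1e10  # hypothetical spore density (spores/ml), not from the source
    for factor in tenfold_series(10)[7:]:  # the 10^-8 to 10^-10 dilutions
        print(f"{factor:.0e}: ~{expected_colonies(stock, factor):.1f} colonies")
```

With a stock of this assumed density, the highest dilutions yield plates with few enough colonies to pick single, well-isolated clones.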

[0213] Strain [C03U03]023: An aqueous spore suspension of [C03]023 was
mutagenized by UV radiation (254 nm) at different energy levels (expressed as
mJoules per surface area). Clone [C03U03]023 obtained at 0.4 mJ/cm2 showed
slightly more than three times better production than the parent clone [C03]023.
Production of Compound 2(a) by the new clone has been consistently
reproducible both in shaken flask (500 ml medium QB or VA in 2-L baffled flasks) 
and in 100-L fermentors with medium VA. 
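
The two selection steps can be combined into a rough overall estimate. The sketch below assumes that the fold improvements are multiplicative and were measured under comparable conditions, which the text does not state; the function name is illustrative.

```python
def cumulative_fold(*step_folds: float) -> float:
    """Overall fold improvement from successive, multiplicative steps."""
    overall = 1.0
    for fold in step_folds:
        if fold <= 0:
            raise ValueError("fold improvements must be positive")
        overall *= fold
    return overall
```

Taking three-fold for [C03]023 over the wild type and a little over three-fold for [C03U03]023 over [C03]023, `cumulative_fold(3.0, 3.0)` suggests that [C03U03]023 produces on the order of nine times the wild-type level, as a lower-bound estimate under the stated assumptions.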

Example 1(B): Activation of lyophilized sample of Strain [C03U03]023
[0214] Strain [C03U03]023 was provided as a lyophilized pellet. The lyophilized
sample was opened under aseptic conditions, and 0.3-0.5 ml of medium ITSB 
was added to the sample to make a cell suspension. The cell suspension was 
transferred to 25 ml of medium ITSB (described below) in a 125-ml flask to form 
a liquid culture. The liquid culture was incubated at 28 °C for 3-5 days until visible 
growth occurred. Purity of the culture was tested by streaking a loop on ISP2 
agar plate. 

Example 1(C): Preparation and Storage of glycerol stocks of Strain [C03U03]023
[0215] Strain [C03U03]023 was grown for 7-10 days at 28 °C on several tomato
paste-oat meal agar plates. Surface growth was collected from each plate into 5
ml sterile distilled water, spun down by centrifugation at 5000 rpm (10 min), and
dispersed in 10 ml sterile water. After a second centrifugation under the same
conditions, the pellet was resuspended in 2 ml sterile 25% glycerol and 0.5-ml
aliquots were stored at -80 °C in screw-capped vials. In addition to the glycerol
stocks, the collected cell mass could be resuspended in 15% sterile skim milk,
dispensed in 0.5-ml aliquots into glass ampoules and lyophilized following
standard procedures.

Example 1(D): Preparation of Seed Culture 

[0216] A vial containing frozen mycelia prepared as described in Example 1(C)
was taken out of the freezer and kept on dry ice. Under aseptic conditions, a loopful
of the frozen culture was streaked on the surface of a tomato paste-oat
meal agar plate and incubated at 28 °C until vegetative mycelium appeared (5-7
days). To start the seed culture, 2-3 loopfuls of the surface growth
obtained from the tomato paste-oat meal agar plate were transferred to a 1.5-ml
Eppendorf tube containing 300 µl of medium ITSB. The mycelium with agar
fragments was homogenized, and 1 ml of medium ITSB was added to the
suspension. The content was used to inoculate two 125-ml flasks, each containing
25 ml of sterile medium ITSB. The flasks were incubated at 28 °C for 65-70 hours in
a rotary shaker at 250 rpm. This seed culture was then used to inoculate
production medium QB or VA.

Example 1(E): Production of Compound 2(a) by Fermentation
[0217] A sample of the seed culture prepared as described in Example 1(D)
above was checked microscopically for any possible contamination. A sample of
the seed culture was then streaked onto one ISP2 plate (control plate) and
incubated at 28 °C. Under aseptic conditions, 10 ml of the seed culture was
used to inoculate each 2-liter baffled flask containing 500 ml of sterile
medium QB or VA. The fermentation batches were incubated aerobically with
shaking (250 rpm) at 28 °C for a period of 7 days. After 3-5 days of incubation the
control plate was checked for purity of the culture.

[0218] The compositions of the growth media used in Examples 1(A) - 1(E) are
given below. Note that either production medium QB or VA may be used in the
production of Compound 2(a); however, production medium VA is preferred when
conducting the fermentation on a large scale.



Seed Medium ITSB:

Trypticase Soy Broth (Difco)       30 g
Yeast extract (Sigma)              3 g
MgSO4 (Sigma)                      2 g
Glucose (Sigma)                    5 g
Maltose (Sigma)                    4 g
Distilled water                    1 L

Production Medium VA:

Glucose                            50 g
Soybean Flour                      30 g
CaCO3                              6 g
NaCl                               5 g
(NH4)2SO4                          3 g
Distilled water                    1 L

Production Medium QB:

Soluble starch (Sigma)             10 g
Glucose (Sigma)                    12 g
Pharmamedia (Traders Protein)      10 g
Corn steep liquor (Sigma)          5 g
Proflo oil (Traders Protein)       4 mL *
Distilled water                    1 L
* Adjust pH to 7.2, then add Proflo oil

Tomato paste Oatmeal Agar:

Baby Oatmeal Food (Heinz)          20 g
Tomato Paste                       20 g
Agar                               15 g
Tap water                          1 L
pH 7.0



[0219] The production of Compound 2(a) may also be carried out in the
production media having the compositions indicated in Table 4, supra, in order
of preference.

Example 2 Isolation of Compound 2(a) 

[0220] Thirty minutes prior to harvest of Compound 2(a) from the fermentation
broth of the baffled flasks of Example 1(E), regenerated, water-washed Diaion
HP-20® resin, in a wet-packed volume equal to 12% of the initial
fermentation beer volume, was added to the whole fermentation broth of Example
1(E) and modest agitation was continued for 30 minutes. At harvest the
fermentation broth from 2 x 500 ml flasks was centrifuged and the supernatant
was decanted from the resin and mycelia pellet. The pellet was resuspended in
15% MeOH in water (half the original fermentation beer volume), agitated mildly
and recentrifuged, and the supernatant was decanted from the residue. The
residue was washed a second time in the same manner with another portion of 15%
MeOH in water, followed by a single final wash with methanol:water (7:3 v/v) (half
the original fermentation beer volume) to obtain a well-washed residue. The
well-washed mycelia:resin residue was extracted three times with 100% ethanol,
each extraction using a volume equal to 20% of the original beer volume. The three
extracts were combined and concentrated to dryness under vacuum on a rotary
evaporator.
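
The volumes in the work-up above are all stated as fractions of the original fermentation beer volume. The following is a minimal sketch of that arithmetic for the 2 x 500 ml scale; the function name and dictionary keys are illustrative and not part of the described procedure.

```python
def workup_volumes(beer_volume_ml):
    """Volumes (ml) for resin capture, washes and extraction.

    Fractions are taken from the text: 12% resin, half-volume washes,
    20% ethanol per extraction. Names are illustrative only.
    """
    return {
        "hp20_resin_wet_packed": 0.12 * beer_volume_ml,
        "wash_15pct_meoh_each": 0.50 * beer_volume_ml,    # performed twice
        "wash_meoh_water_7_3": 0.50 * beer_volume_ml,     # performed once
        "ethanol_extract_each": 0.20 * beer_volume_ml,    # performed three times
    }


# Two 500 ml flasks, as in Example 1(E)
volumes = workup_volumes(2 * 500)
for step, ml in volumes.items():
    print(f"{step}: {ml:.0f} ml")
```

The same function can be called with a larger beer volume to scale the resin and solvent quantities proportionally.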
[0221] The three extracts (representing material from 2 x 500 ml flasks) were
combined, filtered on paper and concentrated in vacuo to remove organic
solvents. The resulting semi-solid residue (aqueous suspension) of crude
Compound 2(a) represented greater than 90% of the respective compounds
produced and was about 25% pure. The aqueous suspension was freeze-dried
overnight to give 460 mg of a dark brown solid. The solid was stirred with 10 ml
of methanol and centrifuged for 2 minutes to remove insoluble matter.
[0222] The semi-solid residue of crude Compound 2(a) was then purified using a
Waters Xterra® preparative MS C-18 column with 10 µm packing, of dimensions
19 mm diameter x 150 mm length, using the following gradient table (Table 5)
from 5 mM aqueous ammonium bicarbonate to acetonitrile.

Table 5:

Time (min)     % Aqueous     % Acetonitrile
0              70            30
5              45            55
10             70            30
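
Table 5 gives only the gradient set points. As a rough illustration, the sketch below interpolates the programmed acetonitrile percentage at an arbitrary time, assuming linear ramps between the tabulated points (the source does not state the ramp shape) and ignoring any system dwell volume. The function name is hypothetical.

```python
# (time in min, % acetonitrile) set points from Table 5
GRADIENT = [(0.0, 30.0), (5.0, 55.0), (10.0, 30.0)]


def percent_acetonitrile(t_min):
    """Programmed % acetonitrile at t_min, assuming linear ramps between set points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Programmed composition at 3.4 min, the retention time reported for the product peak
b_at_peak = percent_acetonitrile(3.4)
print(f"{b_at_peak:.1f}% acetonitrile, {100 - b_at_peak:.1f}% aqueous")
```

The value is the composition delivered by the pump at that moment; the composition actually reaching the column lags by the dwell time of the particular instrument.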



[0223] The eluate was monitored at 390 nm. A single run was loaded with 23 mg
of crude residue in 0.5 ml of methanol, and a conservative cut of the peak eluting
at 3.4 minutes afforded compound 2(a). Nineteen runs were conducted to yield
33 mg of product with about 95% purity.
Example 3 Structural Determination of Compound 2(a)
[0224] The structure of compound 2(a) was determined by a combination of
genomic information and spectroscopic data, including mass, UV, and NMR
spectroscopy. The mass was determined by electrospray mass spectrometry to
be 1297 (Figure 13) and the UV λmax were found to be 319, 333 and 350 nm
(Figure 14). The NMR data were collected at 500 MHz with compound 2(a)
dissolved in MeOH-d4, and included proton (Figure 15A), carbon-13 (Figure 15B),
and multidimensional pulse sequences gDQCOSY, gHSQC, gHMBC, and TOCSY
(Figures 15C, 15D, 15E and 15F, respectively).

[0225] Streptomyces aizunensis NRRL B-11277 was grown on oat meal agar
plates for 5-7 days. The surface growth was collected and washed with water,
and DNA was extracted following standard procedures (T. Kieser et al., Practical
Streptomyces Genetics, The John Innes Foundation, Norwich, UK, 2000). The
genomic library was produced in cosmid and plasmid vectors, and the genome
was scanned for the presence of gene sequence tags (GSTs) related to the
biosynthesis of secondary metabolites as described in E. Zazopoulos et al.,
Nature Biotechnology 21:187-190 (2003). The GSTs were used to isolate
cosmids containing the compound 2(a) locus. The PKS system found within the
compound 2(a) locus was determined to contain 9 PKS genes containing 27
modules. (The analysis of this PKS system is fully described elsewhere herein;
see, e.g., Table III and accompanying text.) Full analysis of the PKS and
associated genes led to the prediction of a structure of Formula 1 below.




The position of the glycosidic linkage to the sugar moiety could not be
determined by the genomic analysis; however, the positioning of the
aminohydroxycyclopentenone unit was determined by analogy with its placement
in other actinomycete metabolites (Colabomycin A from Streptomyces
griseoflavus Tue 2880, J. Antibiot. 1988, 41, 1178-85, 1186-1195, or Enopeptin-A
from Streptomyces griseus, Osada et al., J. Antibiot. 44, 1463-6 (1991)).
[0226] To obtain expression of these genes and the end product of this
biosynthesis pathway, S. aizunensis NRRL B-11277 was grown in several
different media designed for the production of secondary metabolites in shaken
flasks. At harvest the broth was diluted with an equal volume of methanol to
induce cell lysis, and the diluted, clarified broth was concentrated 10-fold. An
aliquot (50 µL) of the concentrate from each medium was chromatographed on
a Waters Xterra C-18 HPLC column (19 x 150 mm) at a flow rate of 1 mL/min and
monitored by diode array detector (DAD) UV and positive and negative ion MS.
Fractions (800 µL) were collected and tested for antimicrobial activity against a
panel of indicator strains. From the extracts of several different media, HPLC
fractions in the number 39 to 45 region exhibited strong activity against Candida
albicans, and this correlated with UV absorption λmax at 319, 333, and 351 nm and
with strong MS peaks at m/z 1298 (positive ion mode) and 1296 (negative ion
mode). These physical characteristics were entirely consistent with a metabolite
of Formula 1.

[0227] A high-yielding medium was chosen and the organism was regrown on a
2-liter scale. Compound 2(a) was extracted from the mycelial pellet with
methanol and acetone, and from the broth with Diaion HP-20® resin, from which it
was recovered with methanol after the resin had been washed with
methanol/water 3:2. The crude extracts were purified by HPLC on a Waters
Xterra C-18 column (19 x 150 mm) using an aqueous (5 mM ammonium
bicarbonate) / acetonitrile gradient.

[0228] Compound 2(a), a yellow solid of MW 1297 Da (C70H108N2O20 requires
1296.75), λmax 319, 334, and 351 nm, was the subject of a series of 1D and 2D
NMR measurements including CMR, 1H-NMR, gDQCOSY, gHSQC, gHMBC,
TOCSY, gHSQCTOXY, and several 1D TOCSY experiments. See Figures 15A -
15E. Analysis of these spectra led to the assignments shown for compound 2(a)
in Figure 17. Although considerable overlap of signals rendered unambiguous
assignment of all of the signals to specific protons and carbons impractical,
those assignments that could be made unambiguously confirmed the structure
predicted from the genomics. A major cross peak in the gHMBC spectrum between
the well-separated proton resonance at 4.01 ppm and the anomeric carbon at 102.6
ppm placed the sugar as shown, as this proton falls within a 14-carbon section of
the major chain with fully assigned carbon and proton signals. A well-resolved
carbon spectrum with a high signal-to-noise ratio showed that the unassigned
methylene carbons were at 42.0, 45.3, 45.4 and 46.6 ppm. Analysis by gHSQC
indicated that these were attached to protons at 2.24, 1.62, 1.50 and 1.68, and
1.55 ppm, respectively. Similarly, the unassigned carbinols at 66.2, 66.2
(resolved), 67.2 and 69.0 ppm were attached to protons at 4.06, 4.08, 4.22 and
3.89 ppm, respectively, and the unassigned olefinic carbons at 129.1, 131.0,
131.9, 133.3, 133.7, 134.3, 134.8, 136.5, and 138.0 ppm were attached to protons
at 5.72, 5.72, 6.28, 6.25, 6.28, 6.25, 6.19, 5.53, and 5.86 ppm, respectively. The
aminohydroxycyclopentenone signals were not straightforward and reflected the
tautomeric equilibrium of this moiety. The upfield methylene signal and the
downfield carbonyl signals were only 10% of the intensity of those from the other
tautomer. The signal from C-1 of this moiety was not detected, a phenomenon
which has been previously ascribed to tautomerization for the same structural
unit. See He, H.; Shen, B.; Korshalla, J.; Siegel, M.M.; Carter, G.T. J. Antibiot.
2000, 53, 191-195.
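
The agreement between the molecular formula C70H108N2O20 and the reported mass data can be checked with a short calculation. The sketch below sums standard monoisotopic atomic masses (rounded to six decimals) and derives the singly charged ions; the assumption that the observed peaks are [M+H]+ and [M-H]- is ours and is not stated in the text.

```python
# Monoisotopic masses of the most abundant isotopes (Da), rounded to 6 decimals
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276  # mass of a proton (Da)


def monoisotopic_mass(formula):
    """Sum of monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


mass = monoisotopic_mass({"C": 70, "H": 108, "N": 2, "O": 20})
mz_positive = mass + PROTON  # assumed [M+H]+ ion
mz_negative = mass - PROTON  # assumed [M-H]- ion

print(f"M = {mass:.2f}, [M+H]+ = {mz_positive:.2f}, [M-H]- = {mz_negative:.2f}")
```

Under these assumptions the computed values round to the nominal m/z 1298 and 1296 reported for the positive and negative ion modes.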

Example 4 Minimal Inhibitory Concentration (MIC) Determination for 
Compound 2(a) 

[0229] The MIC determination for fungal and bacterial organisms was performed
using the broth microdilution assay adapted from the National Committee for
Clinical Laboratory Standards (NCCLS) guidelines M27-A, Reference Method
for Broth Dilution Antifungal Susceptibility Testing of Yeasts; Approved Standard
(Vol. 17, No. 9, 1997), and M23-A, Reference Method for Broth Dilution Antifungal
Susceptibility Testing of Filamentous Fungi; Approved Standard (Vol. 22, No. 16).
Materials:

1) Overnight broth cultures of bacterial and fungal strains to be tested;

2) Stock solution of Compound 2(a) at 3.2 mg/ml in DMSO; 

3) Standard 96 well round-bottom plates, sterile; 

4) Cation adjusted Mueller-Hinton broth, or Brain Heart Infusion broth (for 
antibacterial testing); 

5) Morpholinepropanesulfonic acid (MOPS)-buffered RPMI-1640 medium (for 
antifungal testing); 

6) Sterile isotonic saline (0.85%); 

7) McFarland 0.5 Barium Sulfate Turbidity Standard at 100 X 3.2mg/ml. 

[0228] Test compound preparation: The test article was prepared as 100x stock
solutions in DMSO, with concentrations ranging from 3.2 mg/ml to 0.00625 mg/ml
(a two-fold dilution series over 10 points). The first dilution (3.2 mg/ml) was
prepared by resuspending 0.5 mg of each test article in 156.25 µl of DMSO. The
stock was then serially diluted in two-fold increments to obtain the desired
concentration range.
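
The stock preparation above reduces to two small calculations: the DMSO volume needed for the top stock and the concentrations of the successive two-fold dilutions. The following is a minimal sketch, assuming strict halving at each of the 10 points; the function names are illustrative.

```python
def twofold_series(top_mg_per_ml, points):
    """Concentrations of a two-fold dilution series, highest first."""
    return [top_mg_per_ml / 2 ** i for i in range(points)]


def dmso_volume_ul(mass_mg, conc_mg_per_ml):
    """Volume of DMSO (µl) needed to dissolve mass_mg at conc_mg_per_ml."""
    return mass_mg / conc_mg_per_ml * 1000.0


stocks = twofold_series(3.2, 10)   # 100x stocks in DMSO, mg/ml
top_volume = dmso_volume_ul(0.5, 3.2)

print(f"DMSO for top stock: {top_volume:.2f} µl")
print("100x stocks (mg/ml):", ", ".join(f"{c:g}" for c in stocks))
```

With strict halving, the tenth stock is 3.2 / 2^9 mg/ml.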

[0229] Inoculum preparation: For yeast strains, the inoculum was prepared as
follows. From an overnight culture in Yeast Media broth, cell density was
adjusted in 0.85% saline to 0.5 McFarland. This procedure yielded a stock
suspension of about 5 x 10^6 cells/ml. Following thorough vortexing, a working
suspension was prepared by diluting the stock 1:50 in RPMI 1640, and then
further diluting it 1:20 with RPMI 1640 to obtain the 2x test inoculum (about 5 x
10^3 cells/ml). For filamentous fungi, the inoculum was prepared as follows. From
a spore suspension kept at 4 °C, an appropriate dilution in 0.85% saline was
made to obtain a final optical density at 600 nm between 0.09 and 0.11. A working
suspension was then prepared by diluting the spore suspension 50 times in
RPMI to obtain the 2x test inoculum (about 1 x 10^5 CFU/ml).
[0230] MIC Determination: The 100X test article solutions were diluted 50 times
in RPMI 1640, MH or BHI media and dispensed in a 96 well plate, one
concentration per column, 10 columns in total. The 11th column contained RPMI
1640 with 1% DMSO with cells; the 12th column contained 100 µl of RPMI 1640
alone.

[0231] 50 µl of the final cell dilution (yeast, filamentous fungi or bacteria) of each
indicator strain was added to each corresponding well of the microplate
containing 50 µl of diluted drug or media alone. Assay plates were incubated at
35 °C for up to 72 hrs. MIC readings were determined at 24 and 48 hrs for the
Candida and Aspergillus species, and at 48 and 72 hrs for Cryptococcus
neoformans. The MIC readout for each indicator was determined as the lowest
concentration of test compound resulting in total absence of growth.
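
The plate set-up implies final drug concentrations of one hundredth of the stock (a 50-fold dilution in medium followed by 1:1 mixing with cells), and the MIC is read as the lowest concentration with no growth. The sketch below illustrates that readout on a hypothetical growth pattern; the well readings are invented for illustration and are not data from this study, and the function name is ours.

```python
def mic(readings):
    """Lowest concentration showing no growth, scanning down from the highest.

    readings: list of (concentration, grew) pairs. Returns None if even the
    highest concentration shows growth.
    """
    result = None
    for concentration, grew in sorted(readings, reverse=True):
        if grew:
            break  # stop at the first well with growth
        result = concentration
    return result


# Final concentrations in µg/ml: 3.2 mg/ml stock / 100 = 32 µg/ml, then two-fold steps
final_concs = [32 / 2 ** i for i in range(10)]

# Hypothetical readout: growth in every well below 4 µg/ml
readings = [(c, c < 4) for c in final_concs]
print("MIC (µg/ml):", mic(readings))
```

Scanning from the highest concentration downward means an isolated clear well below a well with growth is not reported as the MIC.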



Table 6: MIC (µg/ml) for Compound 2(a) for various strains of yeast and fungi

                                              MIC (µg/ml)
Yeasts and filamentous fungi               24 hrs     48 hrs
Candida albicans ATCC 10231                4          4
Candida krusei LSPQ 0309                   8          8
Candida glabrata LSPQ 0250                 4          8
Candida lusitaniae ATCC 200953             4          4
Saccharomyces cerevisiae ATCC 9763         4          4
Cryptococcus neoformans ATCC 32045         2*         4**
Aspergillus flavus ATCC 204304             4          8
Aspergillus fumigatus ATCC 204305          16         16

* 48 hrs reading; ** 72 hrs reading

Example 5. In vitro activity of compound 2(a) against Aspergillus species
[0232] To determine the antifungal activity of compound 2(a) against Aspergillus
species (A. fumigatus and A. flavus), a disk diffusion assay was used to
determine the minimum effective concentration (MEC) as described by Wong
GK, Griffith S, Kojima I and Demain AL. Antifungal activities of rapamycin and its
derivatives, prolylrapamycin, 32-desmethylrapamycin, and
32-desmethoxyrapamycin. J. Antibiotics, 51(5): 487-491, 1998. Such an assay is
commonly used to reveal activity of antifungal drugs against filamentous fungi
such as Aspergillus sp. (Arikan S, Yurdakul P, Hascelik G. Comparison of two
methods and three end points in determination of in vitro activity of micafungin
against Aspergillus spp. Antimicrobial Agents and Chemotherapy 47(8):
2640-2643, 2003).

[0233] Preparation of the inoculum: After spreading on YM agar (in cell culture
flasks), Aspergillus strains (A. flavus - ATCC 204304 and A. fumigatus - LSPQ
204305) were left sporulating for 4 to 5 days at 35 °C. After the addition of 10 to
20 ml of saline solution (0.85% NaCl), spores were collected by gently rubbing
the surface of the conidiophores with a disposable inoculation loop. Aspergillus
spore suspensions, kept at 4 °C, were used as the inoculum for the disk assays.
[0234] Preparation of the disks: Stock solutions (5 mg/ml) in methanol and
dilutions (0.25, 0.5, 1.0, 2.5, 5.0, 7.5, 10.0 and 50.0 µg/ml), prepared by serial
dilution of the stock solution in methanol, were prepared for the test article and
each of the control compounds. Itraconazole and caspofungin were used as positive
controls while fluconazole or DMSO alone were used as negative controls.
Drug-containing disks were prepared by spotting 10 µl of the proper drug solution
(or methanol as control) onto filter disks that were then allowed to air-dry.
[0235] Agar plate preparation: Aspergillus spore suspensions were adjusted to
about 81% transmittance at 530 nm in saline solution. 200 µl of the adjusted
inoculum was then mixed thoroughly with 50 ml of melted 0.8% YM agar (cooled to
~50 °C) and poured into a 150 mm Petri dish. Once the agar was set, the
prepared filter disks were loaded onto the plates, which were incubated at 35 °C.
The zone of inhibition (ZOI) of fungal growth was measured after 24 hours of
incubation.

[0236] Results: Data presented in Table 7 show the lowest concentration (MEC)
inducing inhibition of the fungal growth and the corresponding ZOI obtained at
this concentration for compound 2(a) and the controls. Results demonstrated that
compound 2(a) was active against Aspergillus fumigatus and Aspergillus flavus.
A similar effect was obtained for itraconazole and caspofungin, while fluconazole
was inactive.



Table 7

                    Aspergillus fumigatus         Aspergillus flavus
                    MEC (µg/ml)   ZOI (mm)        MEC (µg/ml)   ZOI (mm)
Methanol            0             0               0             0
Compound 2(a)       2.5           2.7             2.5           2.7
Itraconazole        1.0           1.7             0.5           1.7
Caspofungin         2.5           0.7             2.5           0.7
Fluconazole         0             0               0             0

MEC: minimum effective concentration

ZOI: zone of inhibition of fungal growth calculated for each MEC

Example 6. Evaluation of Antifungal Activity of Compound 2(a) in a Mouse 
Model of Disseminated Candidiasis 

[0237] Compound 2(a) was provided as a dry powder with an estimated purity of
95+%. Fungizone (amphotericin B desoxycholate, used as a comparator)
was also provided as a dry powder with an estimated purity of 95+%.
Compound 2(a) and Fungizone were stored as dry powders at -80 °C until the
day of administration.
[0238] Female mice (species Mus musculus, strain CD-1, Charles River) with a
body weight range of 22-24 g were used in the study. The animals were
observed for 3 days before treatment. All animal experiments were performed at
the Ste-Justine Hospital (Montreal, Quebec) according to the ethical guidelines
for animal experimentation of the ethical committee of the hospital. During the
study, dead or apparently sick animals were promptly removed and sick mice
were euthanized upon removal from the cage.

[0239] The animals were maintained in rooms under controlled conditions of
temperature (23±2 °C), humidity (45±5%), photoperiodicity (12 hrs light / 12 hrs
dark) and air exchange. The animals were housed in polycarbonate cages
(4 per cage) equipped to provide food and water. Sterile wood shavings were
used for animal bedding and the bedding was replaced every other day. Food
(Harlan Teklad, Canada) and autoclaved tap water were provided ad libitum, the
food being placed in the metal lid on top of the cage. Water bottles were
equipped with rubber stoppers and sipper tubes and were cleaned, sterilized and
replaced once a week.

[0240] Six groups of mice (10 mice per group) were infected intravenously with 3
x 10^6 CFU of C. albicans SC5314 as previously described (see Dubois, N., et al.,
Microbiology 1998, 144: 2299-2310). Twenty-four hours after infection, each
individual group of mice was treated with Compound 2(a) (1 or 3 mg/kg i.p.),
Fungizone (0.25, 0.5 or 1 mg/kg i.p.) as comparator, or sham-treated with sterile
water containing 5% dextrose and 3% DMSO. Each animal received 100 µl of
test solution.

[0241] The treatment regimen was repeated once daily for a total of 4 days. The
mice were observed twice daily for signs of morbidity over 21 days. Moribund
animals were scored as non-survivors and euthanized by CO2 inhalation. The
Kaplan and Meier product limit estimate was used to analyze survival data and
plot the survival function.
Table 8: Survival Rates Over Time After Inoculation with Compound 2(a) and
Fungizone

Group     Treatment         Dose (mg/kg)     Median survival
1         Vehicle           -                5 days
2         Compound 2(a)     1.0              8.5 days
3         Compound 2(a)     3.0              20 days
4         Fungizone         0.25             >21 days
5         Fungizone         0.5              >21 days
6         Fungizone         1.0              >21 days

[0242] As indicated in Table 8, compound 2(a) has in vivo antifungal activity
similar to that of a 0.25 mg/kg dose of Fungizone and increases the median
survival time of infected mice 4-fold.

[0243] The data (percent survival versus days post-inoculation) was plotted; the 
resulting graph is shown in Figure 16. 
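
The Kaplan and Meier product limit estimate used for the survival analysis can be sketched in a few lines. The example below uses made-up survival times for a single group of 10 mice, with survivors censored at day 21; it illustrates the estimator only and does not reproduce the study data or the exact median convention used for Table 8.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: day of death or last observation for each animal.
    events: True if the animal died at that time, False if censored.
    Returns [(time, survival fraction)] at each time with at least one death.
    """
    data = list(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, died in data if tt == t and died)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is 50% or less; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical group: 8 deaths, 2 survivors censored at the end of observation (day 21)
times = [3, 4, 4, 5, 5, 5, 7, 8, 21, 21]
events = [True] * 8 + [False] * 2

curve = kaplan_meier(times, events)
for day, fraction in curve:
    print(f"day {day}: {fraction:.0%} surviving")
print("median survival (days):", median_survival(curve))
```

A group in which more than half of the animals survive to day 21 never crosses 50%, so `median_survival` returns `None`, which corresponds to a reported median of ">21 days".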

Example 7. In Vitro Antitumor activity of Compound 2(a) 

[0244] An in vitro antiproliferative study of Compound 2(a) was performed by the
National Cancer Institute (National Institutes of Health, Bethesda, Maryland,
USA) against a panel of cancer cell lines in order to determine the concentrations
needed to obtain a 50% inhibition of cell proliferation (IC50). This
unique screen utilizes 60 different human tumor cell lines, representing leukemia,
melanoma, and cancers of the lung, colon, brain, ovary, breast, prostate and
kidney. Compound 2(a) was provided as a lyophilized powder with an estimated
purity of 90+%. The compound was stored at -20 °C until the day of use.
[0245] The human tumor cell lines of the cancer-screening panel were grown in
RPMI 1640 medium containing 5% fetal bovine serum and 2 mM L-glutamine.
For a typical screening experiment, cells were inoculated into 96 well microtiter
plates in 100 µl at plating densities ranging from 5000 to 40,000 cells/well
depending on the doubling time of individual cell lines (Table 9). After cell
inoculation, the microtiter plates were incubated at 37 °C, under 5% CO2, 95% air
and 100% relative humidity for 24 hours prior to addition of the experimental
drugs.

[0246] After 24 hours, two plates of each cell line were fixed in situ with TCA, to
represent a measurement of the cell population for each cell line at the time of
drug addition (Tz). Compound 2(a) was solubilized in dimethyl sulfoxide at
400-fold the desired final maximum test concentration and stored frozen prior to
use. At the time of drug addition, an aliquot of frozen concentrate was thawed and
diluted to twice the desired final maximum test concentration with complete
medium containing 50 µg/ml gentamicin. Four additional serial dilutions were
made to provide a total of five drug concentrations plus control. Aliquots of 100
µl of these different drug dilutions were added to the appropriate microtiter wells
already containing 100 µl of medium, resulting in the required final drug
concentrations (2.5 x 10^-5 M to 2.5 x 10^-9 M).

[0247] Following drug addition, the plates were incubated for an additional 48
hours at 37 °C, 5% CO2, 95% air, and 100% relative humidity. For adherent
cells, the assay was terminated by the addition of cold TCA. Cells were fixed in
situ by the gentle addition of 50 µl of cold 50% (w/v) TCA (final concentration,
10% TCA) and incubation for 60 minutes at 4 °C. The supernatant was discarded,
and the plates were washed five times with tap water and air-dried.
Sulforhodamine B (SRB) solution (100 µl) at 0.4% (w/v) in 1% acetic acid was
added to each well, and plates were incubated for 10 minutes at room
temperature. After staining, unbound dye was removed by washing five times
with 1% acetic acid and the plates were air-dried. Bound stain was
subsequently solubilized with 10 mM trizma base, and the absorbance was read
on an automated plate reader at a wavelength of 515 nm. For suspension cells,
the methodology was the same except that the assay was terminated by fixing
settled cells at the bottom of the wells by gently adding 50 µl of 80% TCA (final
concentration, 16% TCA).
[0248] The growth inhibitory power of compound 2(a) was measured by NCI
utilizing the GI50 value, rather than the classical IC50 value. The GI50 value
emphasizes the correction for the cell count at time zero. Using the seven
absorbance measurements [time zero (Tz), control growth (C), and the test
growth in the presence of drug at each of the five concentration levels (Ti)], GI50
is calculated from [(Ti - Tz) / (C - Tz)] x 100 = 50, which is the drug concentration
resulting in a 50% reduction in the net protein increase (as measured by SRB
staining) in control cells during the drug incubation. The GI50 values for
compound 2(a) for the various cell lines tested are presented in Table 9 below.
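
The GI50 readout described above can be illustrated with a short calculation: percent growth is computed at each drug concentration from Ti, Tz and C, and the concentration giving 50% of control net growth is found by interpolation. The sketch below interpolates linearly on a log10 concentration axis, which is an assumption on our part; the absorbance values are invented for illustration and are not NCI data.

```python
import math


def percent_growth(ti, tz, c):
    """Net growth in the presence of drug as a percentage of control net growth."""
    return (ti - tz) / (c - tz) * 100.0


def gi50(concs_molar, ti_values, tz, c):
    """Concentration at which percent growth crosses 50, or None if it never does.

    Interpolates linearly in log10(concentration) between the two bracketing
    test concentrations (an assumed interpolation scheme).
    """
    growth = [percent_growth(ti, tz, c) for ti in ti_values]
    pairs = sorted(zip(concs_molar, growth))
    for (c_lo, g_lo), (c_hi, g_hi) in zip(pairs, pairs[1:]):
        if g_lo >= 50.0 >= g_hi and g_lo != g_hi:
            fraction = (g_lo - 50.0) / (g_lo - g_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Five final concentrations from the text: 2.5 x 10^-9 M to 2.5 x 10^-5 M
concs = [2.5e-9, 2.5e-8, 2.5e-7, 2.5e-6, 2.5e-5]

# Hypothetical absorbances at 515 nm
tz, control = 0.2, 1.0
ti = [0.98, 0.90, 0.80, 0.40, 0.10]

value = gi50(concs, ti, tz, control)
print(f"GI50 = {value:.2e} M")
```

A negative percent growth, as at the highest concentration in this example, indicates fewer cells than at the time of drug addition.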

Table 9: NCI Developmental Therapeutics Program In-Vitro Testing Results
for Compound 2(a)

Cell line       Panel (tumor type)            Inoculation density       GI50 (x 10^-6, unless
                                              (no. of cells per well)   otherwise indicated)
K-562           Leukemia                      5000                      9.18
MOLT-4          Leukemia                      30,000                    5.57
A549/ATCC       Non-small cell lung cancer    7500                      4.09
EKVX            Non-small cell lung cancer    20,000                    5.87
HOP-62          Non-small cell lung cancer    10,000                    6.83
HOP-92          Non-small cell lung cancer    20,000                    9.77 x 10^[exponent illegible]
NCI-H226        Non-small cell lung cancer    20,000                    3.10
NCI-H23         Non-small cell lung cancer    20,000                    4.25
NCI-H322M       Non-small cell lung cancer    20,000                    3.48
NCI-H460        Non-small cell lung cancer    7500                      3.83
NCI-H522        Non-small cell lung cancer    20,000                    2.80
COLO 205        Colon cancer                  15,000                    5.00
HCC-2998        Colon cancer                  15,000                    6.03 x 10^[exponent illegible]
HCT-116         Colon cancer                  5000                      4.18
HCT-15          Colon cancer                  10,000                    3.25
HT29            Colon cancer                  5000                      6.36
KM12            Colon cancer                  15,000                    2.76
SW-620          Colon cancer                  10,000                    5.35
SF-268          CNS cancer                    15,000                    3.64
SF-295          CNS cancer                    10,000                    3.91
SNB-19          CNS cancer                    15,000                    5.58
SNB-75          CNS cancer                    20,000                    3.87
U251            CNS cancer                    7500                      3.65
LOX IMVI        Melanoma                      7500                      3.73
MALME-3M        Melanoma                      20,000                    2.40
M14             Melanoma                      15,000                    4.15
SK-MEL-2        Melanoma                      20,000                    4.34
SK-MEL-28       Melanoma                      10,000                    6.75
SK-MEL-5        Melanoma                      10,000                    4.16
UACC-257        Melanoma                      20,000                    3.74
UACC-62         Melanoma                      10,000                    2.68
IGROV1          Ovarian cancer                10,000                    2.95
OVCAR-3         Ovarian cancer                10,000                    3.40
OVCAR-4         Ovarian cancer                15,000                    4.48
OVCAR-5         Ovarian cancer                20,000                    4.00
OVCAR-8         Ovarian cancer                10,000                    4.34
SK-OV-3         Ovarian cancer                20,000                    7.94
786-0           Renal cancer                  10,000                    3.07
A498            Renal cancer                  25,000                    4.82
ACHN            Renal cancer                  10,000                    2.96
CAKI-1          Renal cancer                  10,000                    2.99
RXF 393         Renal cancer                  15,000                    1.20
SN12C           Renal cancer                  15,000                    1.38 x 10^[exponent illegible]
TK-10           Renal cancer                  15,000                    3.32
UO-31           Renal cancer                  15,000                    3.65
PC-3            Prostate cancer               7500                      2.66
DU-145          Prostate cancer               10,000                    3.78
MCF7            Breast cancer                 10,000                    4.22
NCI/ADR-RES     Breast cancer                 15,000                    4.76
MDA-MB-         Breast cancer                 20,000                    3.38
MDA-MB-435      Breast cancer                 15,000                    3.26
BT-549          Breast cancer                 20,000                    4.59
T-47D           Breast cancer                 20,000                    6.00


[0249] The results indicate that compound 2(a) is effective against all the human
tumor cell lines that have been assayed in the NCI screening panel, suggesting
broad anticancer activity against several types of human cancer. In fact, the
GI50 calculated for all cell lines was lower than 10 x 10^-6 M, a significant level of
pharmacological activity for anticancer drugs, and in some cases reached the
nanomolar or picomolar level (SN12C/renal carcinoma; HOP92/non-small cell
lung carcinoma; HCC2998/colon carcinoma).

Example 8 Activation of inactive domains in the polyketide synthase 
system 





[0250] The gene cluster encoding Compound 2(a), derived from Streptomyces
aizunensis strain NRRL B-11277, is genetically modified to reactivate the
ketoreductase (KR) domain, which is encoded in ORF 13, module 12. This
modification results in the conversion of the central carbonyl group adjacent to
the sugar molecule of Compound 2(a) to a hydroxyl group (as shown in Figure
12a).

[0251] In the compound 2(a) locus, the KR domain present in ORF 13, module 12
is inactive. To provide the compound of Example 8, the KR domain is
reactivated or swapped for an active KR domain. Reactivation of the KR domain
requires diagnosis of the integrity of the critical active site residues necessary
for a functional KR domain. The active site residues can be divided into those
required for coenzyme activation of the KR enzyme and those required for
catalysis. Experiments identifying the specific residues required for
ketoreductase activity [Reid et al., Biochemistry, 2003, 42:72-79; Udo et al.,
Biochemistry, 1997, 36:34-40] reveal that the functional KR coenzyme binding
site residues include glycine (G), glycine (G), glycine (G) and alanine (A), and
that the functional KR active site residues include serine (S), tyrosine (Y) and
asparagine (N). These residues are highlighted in Figures 6a and 6b. The
sequence of the KR domain in the compound 2(a) locus shows that the coenzyme
binding site residues are glycine (G), glycine (G), glycine (G) and alanine (A),
indicating that this site is indeed active. However, the amino acid residues found
in the KR site responsible for catalytic activity are serine (S), glutamine (Q) and
asparagine (N), indicating that the catalytic site is likely to be inactive. This
observation is confirmed by the fact that compound 2(a) contains a carbonyl
group at that specific position (Figure 10, module 12). Modification of the codon
encoding glutamine to a codon encoding tyrosine provides the active site residue
required for functional ketoreduction of PKS monomers. The resulting altered
nucleic acid sequence of the compound 2(a) locus is used to modify a suitable
host cell to produce the compound 2(a) variant of Example 8, as shown in
Figure 12a.
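
The residue comparison described in this paragraph can be sketched in Python as follows. This is only an illustrative sketch: the function name, the representation of each site as a string of one-letter residue codes and the returned dictionary are assumptions made for the example, and extracting the residues from a sequence alignment is not shown.

```python
# Reference residues for a functional KR domain, as listed in the text.
FUNCTIONAL_COENZYME_SITE = "GGGA"   # glycine, glycine, glycine, alanine
FUNCTIONAL_CATALYTIC_SITE = "SYN"   # serine, tyrosine, asparagine


def diagnose_kr(coenzyme_site: str, catalytic_site: str) -> dict:
    """Compare the residues observed in a KR domain with the functional ones.

    Mismatches are reported as (position, observed, expected) tuples, with
    0-based positions inside each site string (hypothetical convention).
    """
    def mismatches(observed: str, expected: str) -> list:
        if len(observed) != len(expected):
            raise ValueError("site strings must have the same length")
        return [(i, o, e)
                for i, (o, e) in enumerate(zip(observed, expected))
                if o != e]

    coenzyme = mismatches(coenzyme_site.upper(), FUNCTIONAL_COENZYME_SITE)
    catalytic = mismatches(catalytic_site.upper(), FUNCTIONAL_CATALYTIC_SITE)
    return {
        "coenzyme_mismatches": coenzyme,
        "catalytic_mismatches": catalytic,
        "predicted_active": not coenzyme and not catalytic,
    }


# KR domain of ORF 13, module 12, as described in the text:
# coenzyme site G, G, G, A and catalytic site S, Q, N.
report = diagnose_kr("GGGA", "SQN")
```

For the residues given in the text, the only mismatch reported is glutamine in place of tyrosine in the catalytic site, which is the position targeted for modification.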

[0252] The modification of glutamine to tyrosine may be introduced using a
mismatched primer that hybridizes to the native nucleotide sequence at a
temperature below the melting temperature of the mismatched duplex. The
primer is kept specific by keeping primer length and base composition within
narrow limits and keeping the mutant base centrally located, as described in
Zoller and Smith, Methods in Enzymol. (1983) 100:468. Primer extension is
achieved using DNA polymerase. The product is cloned, and positive clones
containing the mutated DNA, derived by segregation of the primer-extended
strand, are selected. Selection is made using the mutant primer as a
hybridization probe (Dalbadie-McFarland et al., Proc. Natl. Acad. Sci. USA
(1982) 79:6409).
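
As a rough illustration of a mismatched primer with the mutant bases centrally located, the following Python sketch builds a primer around a replaced codon. The template sequence, the flank length and the function names are hypothetical and are not taken from the compound 2(a) locus; the primer is written in the coding-strand orientation for readability, and CAG (glutamine) and TAC (tyrosine) are simply one valid pair of codons from the standard genetic code.

```python
def design_mutagenic_primer(template: str, codon_start: int,
                            new_codon: str, flank: int = 12) -> str:
    """Return a primer carrying new_codon with `flank` native bases per side.

    template    -- coding-strand sequence (5'->3') around the target codon
    codon_start -- 0-based index of the first base of the codon to replace
    new_codon   -- replacement codon, e.g. "TAC" for tyrosine
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three bases long")
    if codon_start - flank < 0 or codon_start + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence for the primer")
    left = template[codon_start - flank:codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    # Equal flanks keep the mismatched bases in the centre of the primer.
    return left + new_codon + right


def count_mismatches(primer: str, template: str, start: int) -> int:
    """Count positions where the primer differs from the native sequence."""
    target = template.upper()[start:start + len(primer)]
    return sum(1 for p, t in zip(primer, target) if p != t)


# Hypothetical 27-base stretch with a glutamine codon (CAG) at index 12.
template = "GCCGGCATCTCG" + "CAG" + "GCCAACTACGTC"
primer = design_mutagenic_primer(template, codon_start=12, new_codon="TAC")
n_mismatches = count_mismatches(primer, template, 0)
```

In this made-up example the CAG to TAC change introduces two mismatched bases, flanked by twelve native bases on each side.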

[0253] Another method to generate the compound of Example 8 involves
swapping the inactive ketoreductase domain from the gene locus of
compound 2(a) (ORF 13, module 12) with an active ketoreductase domain from
the same or a different locus. Examples of domains within the same locus
suitable for swapping include the active ketoreductases that occur in the
modules that encode the incorporation of methyl malonate extender units,
namely ORF 16 modules 19 or 20. Swapping of acyltransferase domains between
PKS loci has been demonstrated by Oliynyk et al., Chem Biol, 1996,
3(10):833-9, wherein the gene encoding the acyltransferase domain in
6-deoxyerythronolide B synthase (DEBS) module 1 is swapped with the gene
encoding the rapamycin module 2 acyltransferase, resulting in the synthesis of
novel triketides since the two acyltransferases had different acyl specificities.
Hans et al., J Am Chem Soc, 2003, 125(18):5366-74, teach the kinetic aspects of
product formation as a consequence of acyltransferase domain swaps.
[0254] Swapping of domains is achieved using techniques developed by Kao
et al., Science, 1994, 265:509-512. The genetic strategy utilizes derivatives of
pMAK705 to permit in vivo recombination between a temperature-sensitive donor
plasmid and a recipient shuttle vector by means of a double recombination event
in E. coli. An AmpR TcR recipient subclone of the regions flanking the domain to
be swapped, pCK5, is made, containing 1 kb of flanking sequence from either
flank. Endonuclease restriction sites are introduced at the boundaries of the
domain: PstI at the 3' end of the left flank and XbaI at the 5' end of the right
flank. CmR subclones (pCK6) of the domains to be swapped are generated, and
endonuclease restriction sites are introduced at the boundaries of the domain:
a PstI site at the 5' boundary of the KR domain and an XbaI site at the 3'
boundary of the domain. Restriction sites are introduced into subclones by PCR
mutagenesis. The fragment containing the domain is excised and ligated into the
temperature-sensitive CmR donor plasmid, pCK6. The recipient plasmid is
generated by in vivo recombination of the plasmid in the host strain using the
selection method outlined by Kao et al., supra. After selection, recombinant
strains are produced with the domain of interest replacing the original domain.

Example 9 Inactivation of functional domains within the polyketide
synthase system


[0255] The gene locus encoding Compound 2(a), derived from a Streptomyces
aizunensis strain, is genetically modified to inactivate the enoyl reductase (ER)
domain in ORF 17, module 22. Inactivation of this domain abolishes the
conversion of the double bond to a single bond between the acyl units
incorporated by modules 21 and 22 of Compound 2(a) (as shown in Figure 12e).

[0256] The compound of Example 9 is generated through insertional
inactivation by double crossover techniques developed by Oh and Chater, 1997,
Journal of Bacteriology 179:122-127. Examples of insertional inactivation of
genes involved in polyketide biosynthesis in Streptomyces are well known in the
art. Arrowsmith et al., 1992, Mol Gen Genet 234:254-264, used these techniques
to identify the role of a cassette of secondary metabolic genes in the production
of monensin by Streptomyces cinnamonensis. Paradkar et al., 2001, Appl
Environ Microbiol 67:2292-7, inactivated the lat gene encoding lysine
aminotransferase to disrupt the first step in the cephamycin pathway and block
production of cephamycin C in Streptomyces clavuligerus. Similarly, these
authors inactivated the cvm1 gene involved in late-stage antipodal clavam
synthesis.

[0257] Methods used to inactivate domains in polyketide systems include domain
swapping as described in Example 8 as well as targeted disruption by insertional
gene inactivation. For the latter, replicative plasmid-mediated homologous
recombination is applied to Streptomyces aizunensis. Plasmids for homologous
recombination are constructed by cloning a kanamycin resistance marker
between the left and right flanking regions of the genes to be modified. Such a
construct is cloned into a delivery plasmid that is marked with thiostrepton
resistance, producing a disruption plasmid. This plasmid is introduced into
Streptomyces aizunensis by PEG-mediated protoplast transformation, by
electroporation or by natural infection with a phage (Kieser et al. (2000)
Practical Streptomyces Genetics, John Innes Foundation, Norwich). The spores
from individual transformants or transconjugants are cultured on non-selective
plates to induce recombination. The cycle is repeated three times to enhance the
opportunity for recombination. Crossovers yielding targeted gene recombinants
are then selected and screened using kanamycin and thiostrepton for single
crossovers and kanamycin for double crossovers. Replica plating and Southern
hybridization are used to confirm the double crossover inactivation (Kieser et al.
(2000) supra).
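
The antibiotic screen described in this paragraph amounts to a simple decision rule, sketched below in Python. The function name and the returned labels are assumptions made for illustration; the rule itself follows the text (kanamycin marks the inserted cassette, thiostrepton marks the delivery plasmid), and a double crossover candidate would still need confirmation by Southern hybridization.

```python
def classify_recombinant(kanamycin_resistant: bool,
                         thiostrepton_resistant: bool) -> str:
    """Classify a colony from its replica-plating phenotype.

    Kanamycin resistance marks the cassette placed between the flanking
    regions; thiostrepton resistance marks the delivery plasmid backbone.
    """
    if kanamycin_resistant and thiostrepton_resistant:
        # Cassette and plasmid backbone are both present.
        return "single crossover"
    if kanamycin_resistant:
        # Cassette retained, plasmid backbone lost.
        return "double crossover candidate"
    # Without the kanamycin marker the target has not been disrupted.
    return "not a targeted recombinant"


# Hypothetical replica-plating results: (kanamycin, thiostrepton) resistance.
colonies = {"A": (True, True), "B": (True, False), "C": (False, False)}
calls = {name: classify_recombinant(*phenotype)
         for name, phenotype in colonies.items()}
```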

Example 10 Inactivation of the glycosyltransferase activity 







[0258] Inactivation of the glycosyltransferase gene (GTFA) encoding ORF 9 of
the compound 2(a) locus (as shown in Figure 12b) provides the compound of
this example. The inactivation of GTFA disrupts the transfer of the sugar
moiety onto the backbone of Compound 2(a). The absence of the sugar moiety
results in a non-glycosylated form of Compound 2(a). Insertional inactivation of
GTFA genes in polyketide biosynthesis in Streptomyces is known in the art.
Blanco et al., 2000, Mol Gen Genet 262:991-1000, identified two genes of the
mithramycin biosynthetic gene cluster as glycosyltransferases by the production
of a non-glycosylated mithramycin upon inactivation of these genes. A similar
observation was made by Chen et al., Gene, 2001, 263:255-64, investigating
genes responsible for glycosylation in the biosynthetic pathways encoding
pikromycin, narbomycin, methymycin and neomethymycin.

[0259] Targeted inactivation of the glycosyltransferase activity is achieved using
the method of insertional gene disruption as described in Example 9.

Example 11 Elimination of the aminohydroxycyclopentenone unit






[0260] Elimination of the terminal aminohydroxycyclopentenone unit may be
accomplished by inactivation of any one of the following three ORFs of the
compound 2(a) locus. First, disruption of ORF 35 results in the inactivation of the
acyltransferase (AYTP) activity (as shown in Figure 12c), which abolishes
condensation of succinyl-CoA and glycine to form 5-aminolevulinate. Second,
disruption of ORF 36 results in the inactivation of acyl CoA ligase (CALB),
preventing the conversion of 5-aminolevulinate to 5-aminolevulinate-CoA, which
cyclizes to form aminohydroxycyclopentenone. Third, disruption of ORF 34
(ADSN) prevents transfer of the aminohydroxycyclopentenone unit to the
polyketide chain. Thus, the compound of Example 11 is provided by genetically
modifying at least one of ORFs 34, 35 and 36. Methods used for insertional
inactivation of all three genes are described in Example 9.
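
The three disruption options can be summarized as an ordered pathway, sketched below in Python. The data structure and function name are assumptions made for illustration; the ORF numbers, enzyme designations and steps are those given in the paragraph.

```python
# Steps leading to attachment of the aminohydroxycyclopentenone unit,
# in pathway order, as described in the text.
PATHWAY = [
    ("ORF 35", "AYTP",
     "condensation of succinyl-CoA and glycine to form 5-aminolevulinate"),
    ("ORF 36", "CALB",
     "conversion of 5-aminolevulinate to 5-aminolevulinate-CoA"),
    ("ORF 34", "ADSN",
     "transfer of the aminohydroxycyclopentenone unit to the polyketide chain"),
]


def first_blocked_step(disrupted_orfs):
    """Return the earliest pathway step blocked by the disrupted ORFs.

    Returns None when none of the three ORFs is disrupted, in which case
    the unit is still expected to be attached.
    """
    disrupted = set(disrupted_orfs)
    for orf, enzyme, step in PATHWAY:
        if orf in disrupted:
            return orf, enzyme, step
    return None


blocked = first_blocked_step(["ORF 34", "ORF 36"])
```

Disrupting any one of the three ORFs is sufficient, so the function only reports where the pathway first stops.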



Example 12 Replacement of the terminal amine group with a guanidino 
group 






[0261] The replacement of the terminal amine with a guanidino group may be
accomplished by the insertional inactivation of ORF 33 (ADHY) using the
methods described in Example 9. The inactivation of ORF 33 (ADHY) (as shown
in Figure 12d) disrupts the synthesis of gamma-amino butanamide, leading to the
accumulation of 4-guanidino butanamide. The accumulated 4-guanidino
butanamide is converted by ORF 27 (CALB) to 4-guanidino butyryl-CoA, which is
then attached to the polyketide synthase enzyme (ORF 10, module 0, as
shown in Figure 10b) through the action of ORF 19 (AYTF).



Example 13: Synthesis of Compound 2(b) by epoxidation of Compound 2(a) 




Compound 2(b) 

[0262] To a solution of Compound 2(a) in tetrahydrofuran (THF) is
added 1 equivalent of meta-chloroperbenzoic acid. The reaction is cooled in an
ice bath and stirred at 0 °C for 1-2 hours. The reaction mixture is then
evaporated to dryness, re-dissolved in methanol and subjected to liquid
chromatography on a column of Sephadex LH-20 to isolate Compound 2(b).

[0263] The epoxide group of Compound 2(b) may be hydrolyzed by treatment of
Compound 2(b) with a small quantity of aqueous hydrochloric acid (1.0 N), thereby
forming the corresponding diol of the formula:




Example 14 : Synthesis of Compound 2(c) by Reduction of 31-oxo group 




Compound 2(c) 

[0264] A solution of Compound 2(a) in acetonitrile is treated with 1.5 equivalents
of NaCNBH3. The reaction is stirred at room temperature for 1 hour. The
reaction mixture is then concentrated to dryness and taken up in
methanol. The mixture is filtered and the filtrate is subjected to liquid
chromatography on a column of Sephadex LH-20 to isolate Compound 2(c).
Alternatively, the reduction of the oxo group at the 31-position may be carried
out using lithium borohydride (LiBH4).
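
The reagent quantities implied by "1.5 equivalents" can be worked out as in the following Python sketch. The amount of Compound 2(a) used here (0.10 mmol) is a hypothetical value chosen for illustration, since this example does not state a scale; the molar masses of sodium cyanoborohydride and lithium borohydride are standard values, and the function name is an assumption.

```python
# Molar masses in g/mol (standard values).
MOLAR_MASS_G_PER_MOL = {
    "NaCNBH3": 62.84,   # sodium cyanoborohydride
    "LiBH4": 21.78,     # lithium borohydride
}


def reagent_mass_mg(substrate_mmol: float, equivalents: float,
                    reagent: str) -> float:
    """Mass of reagent (mg) for a number of equivalents of substrate.

    mmol of reagent = mmol of substrate x equivalents, and
    mmol x (g/mol) gives the mass directly in mg.
    """
    reagent_mmol = substrate_mmol * equivalents
    return reagent_mmol * MOLAR_MASS_G_PER_MOL[reagent]


# Hypothetical scale: 0.10 mmol of Compound 2(a), 1.5 equivalents of reductant.
nacnbh3_mg = reagent_mass_mg(0.10, 1.5, "NaCNBH3")   # about 9.4 mg
libh4_mg = reagent_mass_mg(0.10, 1.5, "LiBH4")       # about 3.3 mg
```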



Example 15 : Synthesis of Compound 2(d) by addition of acetal ring at the
31-position




Compound 2(d) 

[0265] A solution of Compound 2(a) in tetrahydrofuran is treated with 3
equivalents of 2,2-dimethyl-1,3-dioxacyclopentane in the presence of a trace
amount of toluene sulfonic acid. The reaction is stirred overnight at room
temperature, evaporated to dryness and taken up in dry THF, followed by
purification by liquid chromatography on a column of Sephadex LH-20. The
2,2-dimethyl-1,3-dioxacyclopentane may be synthesized by reaction of acetone
with ethylene glycol in the presence of a trace of toluene sulfonic acid, over
molecular sieves to remove water.

[0266] Alternatively, the addition of an acetal ring at the 31-position may be
accomplished by reaction of Compound 2(a) with an excess of ethylene glycol in
the presence of a trace of toluene sulfonic acid. The reaction may be conducted
over molecular sieves to remove water.



Example 16 : Synthesis of Compound 2(e) 




Compound 2(e) 

[0267] To a solution of Compound 2(a) in benzene or toluene is added 10
equivalents of benzylamine. The reaction is stirred at room temperature
overnight. The reaction may be conducted over molecular sieves to remove
water; alternatively, the water may be removed under reflux as an azeotrope with
benzene or toluene using a Dean-Stark trap. The reaction mixture is
concentrated under vacuum and residual reagent is removed under high vacuum at
room temperature overnight.

[0268] The carbon-nitrogen double bond of Compound 2(e) may be reduced to
the amine by reaction of Compound 2(e) with NaCNBH3 or LiBH4 (1.5
equivalents) in acetonitrile, to form a compound of the structure:







Example 17 : Synthesis of Compound 2(f) 




Compound 2(f) 

[0269] To a solution of one equivalent of Compound 2(a) in acetonitrile is added
ten equivalents of isobutylamine. The reaction is stirred at room temperature for
two hours. Benzene (1/10 volume) is added and the mixture is concentrated to
dryness under vacuum on a rotary evaporator.

[0270] The Schiff base is then treated with NaCNBH3 or LiBH4 (1.5 equivalents)
in acetonitrile, to reduce the carbon-nitrogen double bond of the imine to the
amine, to form compound 2(f).



Example 18 : Synthesis of Compound 2(g) 




Compound 2(g)

[0271] Compound 2(g) may be synthesized biosynthetically as described in
Example 9. Alternatively, Compound 2(g) may be prepared by hydrolysis of
Compound 2(a). This is accomplished by treatment of Compound 2(a) in
diethyl ether/THF with Meerwein's reagent (triethyloxonium tetrafluoroborate) for
two hours at room temperature, followed by cooling to -20 °C and dropwise
addition of aqueous acetic acid in THF. The reaction mixture is stirred for 20
minutes, during which time it is allowed to come to room temperature. The
mixture is then diluted with water (2 volumes) and HP-20 polystyrene resin is
added. The mixture is stirred for 30 minutes and filtered, the resin is washed well
with water, and the product is eluted with 100% ethanol. The eluates are
concentrated under vacuum to give compound 2(g).



Example 19 : Synthesis of Compound 2(h) 




Compound 2(h) 



[0272] To a solution of 0.1 equivalents of Compound 2(g) in methanol is added
0.5 equivalents of diazomethane in diethyl ether. The reaction mixture is allowed
to stand at room temperature overnight, and then the solvent is removed under
vacuum to give compound 2(h).



Example 20 : Synthesis of Compound 2(i) 




Compound 2(i) 

[0273] A solution of Compound 2(a) in methanol is treated with an equal volume
of 0.1 N HCl, and the reaction mixture is stirred overnight at room temperature.
The mixture is then diluted with water (2 volumes) and HP-20 polystyrene resin is
added. The mixture is stirred for 30 minutes and filtered, the resin is washed well
with water, and the product is eluted with 100% ethanol. The eluates are
concentrated under vacuum to give compound 2(i).



Example 21 : Synthesis of Compound 2(j) 




Compound 2(j) 



[0274] Compound 2(j) is prepared by hydrolysis of compound 2(g). The
hydrolysis may be carried out in the same way that compound 2(a) is hydrolysed
to compound 2(i), as described in Example 20 above.



Example 22 : Synthesis of Compound 2(k) 




Compound 2(k) 

[0275] Compound 2(k) is prepared biosynthetically by inactivation of the enoyl
reductase as described in Example 9.



Example 23 : Synthesis of Compound 2(l)




Compound 2(l)



[0276] A solution of Compound 2(k) in acetonitrile is treated with 1.5 equivalents
of NaCNBH3. The reaction is stirred at room temperature for 1 hour. The
reaction mixture is then concentrated to dryness and taken up in
methanol. The mixture is filtered and the filtrate is subjected to liquid
chromatography on a column of Sephadex LH-20 to isolate Compound 2(l).
Alternatively, the reduction of the oxo group at the 31-position may be carried
out using lithium borohydride (LiBH4).



Example 24 : Synthesis of Compound 2(m) 




Compound 2(m) 



[0277] A solution of 10 equivalents of Compound 2(a) in acetonitrile is treated
with one equivalent of acetaldehyde. The reaction is stirred at room temperature
for two hours. Benzene (1/10 volume) is added and the mixture is concentrated
to dryness under vacuum on a rotary evaporator to give compound 2(m).

[0278] Compound 2(m) may be treated with NaCNBH3 or LiBH4 (1.5 equivalents)
in acetonitrile, to reduce the carbon-nitrogen double bond of the imine to the
amine.

Example 25 : Synthesis of Compound 2(n) 







Compound 2(n) 



[0279] A solution of 10 equivalents of Compound 2(a) in acetonitrile is treated
with one equivalent of benzaldehyde. The reaction is stirred at room temperature
for two hours. Benzene (1/10 volume) is added and the mixture is concentrated
to dryness under vacuum on a rotary evaporator to give compound 2(n).

[0280] Compound 2(n) may be treated with NaCNBH3 or LiBH4 (1.5 equivalents)
in acetonitrile, to reduce the carbon-nitrogen double bond of the imine to the
amine.

Example 26 : Synthesis of Compound 2(o)




Compound 2(o) 

[0281] A solution of Compound 2(a) in tetrahydrofuran is treated with one 
equivalent of cyanamide. The reaction mixture is stirred at room temperature 
overnight. Solvent is removed from the reaction mixture under vacuum to give 
compound 2(o). 



Example 27 : Synthesis of Compound 2(p) 




Compound 2(p) 

[0282] To a solution of 10 equivalents of Compound 2(a) in acetonitrile is added 1
equivalent of acetone. The reaction is stirred at room temperature for two hours.
Benzene (1/10 volume) is added and the mixture is concentrated to dryness
under vacuum on a rotary evaporator.

[0283] The resulting Schiff base imine is then treated with NaCNBH3 or LiBH4
(1.5 equivalents) in acetonitrile, to reduce the carbon-nitrogen double bond of
the imine to the amine, to form compound 2(p).



Example 28 : Synthesis of Compound 2(q) 




Compound 2(q)

[0284] To a solution of 10 equivalents of Compound 2(a) in acetonitrile is added 1
equivalent of 4-nitrobenzaldehyde. The reaction is stirred at room temperature
for two hours. Benzene (1/10 volume) is added and the mixture is concentrated
to dryness under vacuum on a rotary evaporator.

[0285] The resulting Schiff base imine is then treated with NaCNBH3 or LiBH4
(1.5 equivalents) in acetonitrile, to reduce the carbon-nitrogen double bond of
the imine to the amine, to form compound 2(q).



Example 29 : Synthesis of Compound 2(r) 




Compound 2(r) 

[0286] To a solution of 10 equivalents of Compound 2(a) in acetonitrile is added 1
equivalent of cyclohexylformaldehyde. The reaction is stirred at room
temperature for two hours. Benzene (1/10 volume) is added and the mixture is
concentrated to dryness under vacuum on a rotary evaporator.

[0287] The resulting Schiff base imine is then treated with NaCNBH3 or LiBH4
(1.5 equivalents) in acetonitrile, to reduce the carbon-nitrogen double bond of
the imine to the amine, to form compound 2(r).



Example 30 : Synthesis of Compound 2(s) 




Compound 2(s) 

[0288] To a solution of Compound 2(a) in tetrahydrofuran is added one equivalent
of acetic anhydride and two equivalents of triethylamine. The reaction is stirred
at room temperature for two hours. The mixture is then diluted with water (2
volumes) and HP-20 polystyrene resin is added. The mixture is stirred for 30
minutes and filtered, the resin is washed well with water, and the product is
eluted with 100% ethanol. The eluates are concentrated under vacuum to give
compound 2(s).

Example 31 : Synthesis of Compound 2(t) 




Compound 2(t) 

[0289] To a solution of Compound 2(a) is added one equivalent of isobutyric
anhydride and two equivalents of triethylamine. The reaction is stirred at room
temperature for two hours. The mixture is then diluted with water (2 volumes)
and HP-20 polystyrene resin is added. The mixture is stirred for 30 minutes and
filtered, the resin is washed well with water, and the product is eluted with 100%
ethanol. The eluates are concentrated under vacuum to give compound 2(t).



Example 32 : Synthesis of Compound 2(u) 




Compound 2(u) 

[0290] To a solution of Compound 2(a) is added one equivalent of benzoic
anhydride and two equivalents of triethylamine. The reaction is stirred at room
temperature for two hours. The mixture is then diluted with water (2 volumes)
and HP-20 polystyrene resin is added. The mixture is stirred for 30 minutes and
filtered, the resin is washed well with water, and the product is eluted with 100%
ethanol. The eluates are concentrated under vacuum to give compound 2(u).

Example 33 : Synthesis of Compound 2(v)




Compound 2(v) 

[0291] To a solution of Compound 2(a) is added one equivalent of
p-nitrobenzoic anhydride and two equivalents of triethylamine. The reaction is
stirred at room temperature for two hours. The mixture is then diluted with water
(2 volumes) and HP-20 polystyrene resin is added. The mixture is stirred for 30
minutes and filtered, the resin is washed well with water, and the product is
eluted with 100% ethanol. The eluates are concentrated under vacuum to give
compound 2(v).



Example 34 : Synthesis of Compound 2(w) 




Compound 2(w) 

[0292] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
alanine active ester. The amino group of alanine is protected by reacting alanine
with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(w).

Example 35 : Synthesis of Compound 2(x)




Compound 2(x) 



[0293] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
para-hydroxyphenyl glycine active ester. The amino group of the
para-hydroxyphenyl glycine is protected by reacting para-hydroxyphenyl glycine
with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(x).



Example 36 : Synthesis of Compound 2(y) 




Compound 2(y) 



[0294] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
tyrosine active ester. The amino group of tyrosine is protected by reacting
tyrosine with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(y).



Example 37 : Synthesis of Compound 2(z) 




Compound 2(z) 



[0295] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
valine active ester. The amino group of valine is protected by reacting valine
with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(z).



Example 38 : Synthesis of Compound 2(aa) 




Compound 2(aa) 

[0296] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
proline active ester. The amino group of proline is protected by reacting proline
with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(aa).



Example 39 : Synthesis of Compound 2(ab) 




Compound 2(ab) 



[0297] A solution of Compound 2(a) is reacted with 1 equivalent of N-protected
serine active ester. The amino group of serine is protected by reacting serine
with DCC (dicyclohexylcarbodiimide) or EDC
(1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide) and the carboxylic acid group
is converted to an active ester such as an N-hydroxysuccinimide ester. The
N-protected active ester is added to Compound 2(a) in an inert solvent such as
tetrahydrofuran. The mixture is warmed under reflux for one hour. The mixture is
then diluted with water (2 volumes) and HP-20 polystyrene resin is added. The
mixture is stirred for 30 minutes and filtered, the resin is washed well with
water, and the product is eluted with 100% ethanol. The eluates are concentrated
under vacuum to give compound 2(ab).

Example 40 : Compound 2(a) for the treatment of cardiovascular disorders 

[0298] Polyene compounds are not generally absorbed from the gastrointestinal
tract and exhibit hypocholesterolemic properties by binding cholesterol in the
gastrointestinal tract following oral administration. The hypocholesterolemic
properties of polyene compounds were first demonstrated by studies in dogs
(Schaffner, C.P. and Gordon, H.W. The hypocholesterolemic activity of orally
administered polyene macrolides. P.N.A.S. 61:36-41, 1968). In another study
with chickens, small amounts of polyene compounds in the diet led to the
inhibition of enterohepatic cholesterol circulation, increased fecal lipid excretion
and reduced atherogenesis (Fisher, H., Griminger, P. and Siller, W. Effect of
candicidin on plasma cholesterol and avian atherosclerosis. Proceedings of the
Society for Experimental Biology and Medicine, 145:836-839, 1974). The
beneficial effects of orally administered polyene compounds on cholesterol-lipid
metabolism are not species-dependent, as they have been demonstrated in
several species including humans, rats, dogs and chickens (Pagliano, F.M.
Correction of hyperdyslipidemia using polyene-structure substances. Controlled
clinical trial. Arch Sci Med (Torino), 136:303-308, 1979; Barbara, A. and
Casella, G. Action of a polyene macrolide on hyperdislipidaemic disorders.
Archivio per Scienze Mediche, 137:211-216, 1980; Singhal, A.K., Mosbach, E.H.
and Schaffner, C.P. Effect of candicidin on cholesterol and bile acid metabolism
in the rat. Lipids, 16:423-426, 1981).

[0299] The therapeutic potential of compound 2(a) for the treatment of
cardiovascular disorders such as high cholesterol, dyslipidemia and
atherosclerosis is demonstrated by measuring the effects of oral administration of
compound 2(a) to rabbits. New Zealand rabbits are maintained under controlled
light and temperature conditions and fed for several weeks with two different
diets: normal rabbit chow (control) and a diet containing 0.5 to 1% cholesterol to
induce hypercholesterolemia. Rabbits are administered compound 2(a) (3, 10 or
30 mg/kg) or vehicle by oral gavage daily for up to one month. Food intake and
rabbit weight are measured daily for the duration of the experiment. Blood
samples to measure cholesterol, lipoproteins and triglycerides are collected
through a catheter inserted in the ear artery at the beginning and at the end of
the experiment, as well as every 4 days for the duration of the experiment. Serum
cholesterol, lipoproteins and triglycerides are measured by enzymatic assays
employing commercial kits as specified by the manufacturer (Sigma Chemical
Co.) and as described in Staprans I, Pan X-M, Rapp JH, Feingold KR. Oxidized
cholesterol in the diet accelerates the development of aortic atherosclerosis in
cholesterol-fed rabbits. Arteriosclerosis, Thrombosis and Vascular Biology,
18:977-983, 1998. At the end of the experiment, after collecting the final blood
sample, animals are anesthetized and the descending aorta is exposed, excised
and processed for histological examination following fixation in formalin. Briefly,
longitudinal or cross paraffin sections (five microns) are stained with Sudan black
(which stains lipids) and counterstained with Masson trichrome. Morphometric
quantitative determination of the area of the intima, media and adventitia layers is
performed by image analysis. Lipid deposition in the aorta is determined by
evaluation of the percentage of the aorta covered by lesions visualized by fat
staining. Arterial concentration of cholesterol is measured after extraction of
lipids as described in Thiery J, Nebendahl K, Rapp K, Kluge R, Teupser D and
Seidel D. Low atherosclerotic response of a strain of rabbits to diet-induced
hypercholesterolemia. Arteriosclerosis, Thrombosis and Vascular Biology,
15:1181-1188, 1995.
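
The dosing and blood sampling schedule of this protocol can be laid out as in the following Python sketch. The body weight (3.0 kg) and the 28-day study length are hypothetical values chosen for illustration, and the function names are assumptions; the dose levels and the 4-day sampling interval are those stated in the protocol.

```python
DOSE_LEVELS_MG_PER_KG = (3, 10, 30)   # dose levels stated in the protocol


def daily_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Amount of compound 2(a) given by gavage per day for one animal."""
    return body_weight_kg * dose_mg_per_kg


def sampling_days(study_length_days: int, interval_days: int = 4) -> list:
    """Blood sampling days: day 0, every `interval_days`, and the last day."""
    days = list(range(0, study_length_days + 1, interval_days))
    if days[-1] != study_length_days:
        days.append(study_length_days)   # make sure the end point is sampled
    return days


# Hypothetical 3.0 kg rabbit followed for 28 days.
doses = [daily_dose_mg(3.0, level) for level in DOSE_LEVELS_MG_PER_KG]
schedule = sampling_days(28)
```

Because body weight is recorded daily, the daily amount could be recomputed from the most recent weight for each animal.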